Statistical Analysis and Modeling of the 
High-Energy Proton Data From the 
Telstar® 1 Satellite 


By J. D. GABBE, M. B. WILK and W. L. BROWN 
(Manuscript received October 4, 1966) 


This paper deals with the analysis of data from the omnidirectional 
high-energy proton detector on the Telstar® 1 satellite. The main 
accomplishment is the development of relatively simple (empirical) 
mathematical models which give a statistically accurate representation of 
the measured spatial distribution of intensity of protons with energies 
between 50 and 130 MeV. 

These models depend upon the fitting of 8 (or 9 or 10) coefficients based 
on samples containing approximately 1000 of the nearly 80,000 
experimental observations. The nature of the model for the average 
omnidirectional counting rate permits its closed-form transformation to 
the equivalent equatorial pitch angle distribution. 

Sufficiently accurate fits were achieved so that the residuals (equal to 
observed minus fitted) could be productively examined for possible 
dependence on variables other than the two magnetic coordinates used in 
the fitting. One consequence of this was the detection of instrumental 
susceptibility to temperature and bias voltage changes, which led to an 
objective partitioning of the data. 

The present paper has several evolutionary aspects. In particular, a 
series of one-dimensional fits was employed as a base for developing a 
two-dimensional model; a preliminary analysis of all the data was used 
to guide the rejection of outliers; a first two-dimensional fit to all the 
data led to a data-independent basis for partitioning the data; the mode 
of selection of a sample of data, to which the two-dimensional model was 
fitted, changed as deeper insight into the importance of this issue 
developed; and, after a very satisfactory fit to the data was attained, the 
model was improved by specialization and reparameterization so as to 
overcome some statistical defects and to achieve greater physical meaning. 

The data cover the time period between July 1962 and February 1963, 
and the spatial region bounded by 1.09 R_e ≤ R ≤ 1.95 R_e and 0 ≤ λ < 58°. 
Flux maps having a relative accuracy of about two percent are derived 
from the fit and presented. The temporal behavior of the intensity is 
examined and some changes are noted. The maximum value of the 
omnidirectional flux of protons with energies between 50 and 130 MeV is found 
to be [5.773:4] X 10° protons/cm² sec at L = 1.46 on the magnetic 
equator, in good agreement with other experiments. Relative flux values 
and energy spectra are consistent with the generally accepted picture of 
the proton distribution. 
I. INTRODUCTION 


This paper deals with the analysis of data from the omnidirectional 
high-energy proton detector on the Telstar® 1 satellite. The main ac- 
complishment is the development of a relatively simple (empirical) 
mathematical model which gives a statistically accurate representation 
of the measured spatial distribution of protons with energies between 
50 and 130 MeV. 

The Telstar® 1 satellite was launched into a 45°-inclined orbit with 
an apogee of 5600 km and a perigee of 950 km on day 191 (July 10), 
1962. The period of precession of the apsis was 180 days. The satellite 
was instrumented to measure fluxes of energetic particles; in particu- 
lar, counting rates of protons with energies above 50 MeV were re- 
corded. Two thousand hours of telemetry was received during the ac- 
tive life of the satellite, which terminated on day 52 (February 21), 
1963. The satellite and associated systems have been described in de- 
tail. The particle-detection instruments have been documented? and 
some of the experimental results have been presented.* * ® 

The above-mentioned presentations of information concerning the 
earth’s radiation belts have been principally graphical in format, ow- 
ing to the complexity of the belts and the limited understanding of the 
details of the processes affecting them. 

An accurate analytical representation of the data would enable con- 
venient interpolation, extrapolation, and transformation. Thence it 
would be practical to make extensive comparisons with the results of 
other experiments and with various theoretical predictions and to sum- 
marize, analytically, such features as the equatorial omnidirectional 
counting rate and the approximate size of the equatorial loss cone. In 
addition, an empirical mathematical model would facilitate the study 
of temporal fluctuations in various regions of space. Of course, a good 
analytical representation, even though empirical, may also stimulate 
deeper physical insight and theories. 

The present study was directed toward the development of a math- 
ematical function which would, when fitted to the data, provide a con- 
venient, concise and precise summary description. The mathematical 
model(s), which are herein presented, were empirically evolved, using 
the knowledge that the intensity distribution of these protons is, in 
the main, not rapidly variable in time. Even more specifically, the 
assumption has been that fluctuations in observed counting rates at a 
fixed point in space relative to the earth are independent random vari- 
ables. Further, the main effort of the analysis has been to try to relate 
the observed counting rates to a two-dimensional magnetic coordinate 
system derived from three-dimensional spatial coordinates by mapping 
the known earth’s magnetic field onto the field of a magnetic dipole.® 

The mathematical models which are used depend upon fitting of 
between 8 and 10 coefficients based on samples containing approximately 
1000 of the nearly 80,000 experimental observations. The nature of 
these models for the average omnidirectional counting rate permits 
their closed-form transformation to the equivalent equatorial pitch 
angle distribution. 

The fitted models were sufficiently accurate so that the residuals 
(equal to observed minus fitted) of all the data could be productively 
examined for possible dependence on variables other than the two mag- 
netic coordinates used in the fitting. One consequence of this was the 
detection of instrumental susceptibility to temperature and bias volt- 
age change, which led to an objective partitioning of the data. 

This article summarizes some of the productive aspects of the anal- 
ysis of this body of data. A very large amount of “preliminary” work 
is not reviewed. Though not an historical description of the work, the 
present paper does have several evolutionary aspects. In particular, a 
series of one-dimensional fits was employed as a base for developing 
two-dimensional models; a preliminary analysis of all the data was 
used to guide the rejection of outliers; a first two-dimensional fit to 
all the data led to a data-independent basis for partitioning the data; 
the mode of selection of a sample of data, to which the two-dimen- 
sional model was fitted, changed as deeper insight into the importance 
of this issue developed; and, after a very satisfactory fit to the data 
was attained, the model was improved by specialization and reparam- 
eterization so as to overcome some statistical defects and to achieve 
greater physical meaning. 

Readers with specific interests may wish to consult the Table of 
Contents, the summary (Section XIV) and the following overview for 
guidance. 

Section II introduces the input data which have been analyzed. Co- 
ordinates and notation are tabulated, the distribution of the data is 
displayed, and the general quality and stability of the data are dis- 
cussed. It is shown informally that the measurements may be usefully 
organized in the dipole magnetic coordinate system used. 

In Section III, various alternative coordinate systems and scales are 
considered. The bases for choice of the x,L coordinate system for the 
independent variables and the square-root-of-counting-rate scale for 
the dependent variable are given. 

Section IV brings together the ideas underlying the formulation and 
evolution of the models, and gives mathematical definitions and details. 
Some properties of the models which make them suitable smoothing 
functions for this body of data are indicated. 

One-dimensional fits to the data in each of several L-slices (an 
L-slice is a particular grouping of the data) are displayed on several 
scales and discussed in Section V. It is shown that L-slice fits suffer 
from fundamental deficiencies, in addition to being inconvenient to 
work with. The results of the L-slice fits are used to lead to a two- 
dimensional model. 

Section VI contains the treatment of the preliminary fit of a two- 
dimensional model. This fit is of good quality and provides residuals 
which are used to help identify and eliminate extraneous sources of 
variability in the data and to serve as a basis for more refined sample 
selection. 

The treatment of the two-dimensional fit to the data after it has 
been partitioned to reduce instrumental effects appears in Section VII. 
The method of sample selection is important, and some algorithms and 
their influence on the resultant fits are considered in Section 7.1. The 
advantages of selecting a sample on the basis of a preliminary fit are 
discussed. The fit itself is described and evaluated in the remainder of 
the section. 

A more detailed statistical critique of the fit discussed in Section 
VII is contained in Section VIII; in particular, some remaining 
physical and statistical defects are pinpointed. 

Section IX deals with a modified version of the model, which elimi- 
nates the remaining defects, and gives the results of fitting the most 
satisfactorily parameterized model of the proton distribution. 

Residuals are used to study temporal effects in Section X. An in- 
crease in intensity near L = 2 is noted during October, 1962. An upper 
limit of 0.003 gauss is found for the diurnal variation of the earth’s 
magnetic field near L = 1.5. A possible shift in the location of the 
atmospheric cutoff is examined. 

The behavior of the radiation belt near the top of the atmosphere is 
the subject of Section XI. Although the data do not allow the position 
of the low-altitude cutoff to be accurately determined, the qualitative 
behavior precludes a simple atmospheric cutoff mechanism. 

Section XII is devoted to a comparison of the Telstar® 1 results, 
presented as flux maps, with those obtained on Injuns 1 and 3, 
Explorers 4 and 15, and other satellites. Absolute flux values agree to 
within a factor of 2 in most cases, which is as well as can be expected. 
Very good agreement exists concerning the behavior of the intensity 
in the equatorial plane, on L-shells, and near the top of the 
atmosphere. Experimental results regarding the equatorial pitch angle (see 
Fig. 1) distribution are found to agree well with each other, but differ 
appreciably from the published results of theoretical calculations. 

Section XIII gives brief consideration to possible directions in which 
this work might be extended: improving the fit to the Telstar® 1 high- 
energy protons still further; approaching model development differ- 
ently; employing the data more fully; and encompassing other more 
complex bodies of data. 

Section XIV contains a brief summary of the results and Section XV 
contains acknowledgments. 

Appendix A provides a detailed description of the particle detector 
and its calibration. 

Appendix B gives some statistical background and details of the 
analysis, and Appendix C discusses statistical measures of the good- 
ness of fit of the model over all the partitioned data. 
Fig. 1— Magnetic coordinates of the point P. The spiral is the orbit of a 
particle trapped on the magnetic line of force L = 2.5 and mirroring at B = 
0.0266 gauss. The equatorial pitch angle, α_0, is the angle between the velocity 
vector and the magnetic field vector at the equator. 
II. THE DATA 


The data which are studied in this paper were obtained with a 
detector on the Telstar® 1 satellite which measured protons with energies 
greater than 50 MeV. The sensitive detecting element is a semiconductor 
diode developed specifically for satellite experiments.’ The effective 
geometric factor, g, of the detector depends upon proton energy, but 
over the energy region between 50 and 130 MeV the average geometric 
factor, ḡ, is relatively insensitive to the energy spectrum and an 
approximate value of 0.143 cm² steradian has been selected. These 
considerations are discussed in detail in Appendix A. The response of the 
detector is also dependent upon both temperature and electrical bias 
because of changes in the effective thickness of the active region of the 
detector. These effects are discussed in Section 6.8. 

The primary input to our data reduction process consisted of: the 
telemetry record of the number of counts measured by the detector 
in an 11-second counting interval once every minute; the time at which 
the data were recorded (inserted by the recording station); and the 
ephemeris of the satellite position obtained from tracking data. These 
are supplemented by the satellite spin-axis orientation obtained from the 
mirror flash data*® and by telemetered measurements of the satellite 
skin temperature near the detector and of the detector bias voltage. 

During data reduction, the square root of the counting rate was 
computed for each recorded particle-counting interval and associated 
with the following information: date and time, geographic position, 
position in the earth’s magnetic field, orientation of the detector relative 
to the magnetic field, bias voltage, and skin temperature. 
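
The reduced record described above can be pictured as a small data structure. The following is a minimal sketch only: the class name, field names, and sample values are hypothetical, and the only relation taken from the text is that the dependent variable is the square root of the counting rate over an 11-second interval.

```python
import math
from dataclasses import dataclass

COUNT_INTERVAL_S = 11.0  # length of one counting interval, as stated in the text


@dataclass(frozen=True)
class Observation:
    """One reduced telemetry sample (field names are illustrative assumptions)."""
    counts: int          # Z, counts recorded in one 11-second interval
    time_days: float     # universal time of the sample, in days
    b_gauss: float       # magnetic induction B at the satellite
    l_shell: float       # magnetic shell parameter L
    bias_bits: int       # telemetered detector bias voltage, in bits
    skin_temp_c: float   # skin temperature near the detector, deg C

    @property
    def sqrt_rate(self) -> float:
        """Y = sqrt(Z / 11 s), in (counts/sec)^(1/2)."""
        return math.sqrt(self.counts / COUNT_INTERVAL_S)


# Example with made-up values: 44 counts in 11 s is 4 counts/sec, so Y = 2.
sample = Observation(counts=44, time_days=200.5, b_gauss=0.2,
                     l_shell=1.5, bias_bits=10, skin_temp_c=20.0)
print(sample.sqrt_rate)
```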

The model developed in the present paper is based on the use of 
a two-dimensional magnetic coordinate system, in which the earth’s 
magnetic field is mapped onto an axially symmetric dipole field using 
the adiabatic invariants of particle motion.” Any of a number of 
equivalent pairs of magnetic coordinates, including the B,L; R,λ and x,L 
sets” may be used to locate position in this dipole field. Briefly: The 
magnetic shell parameter, L, specifies a particular line of force (about 
which the trapped particle spirals) by the radial distance to the line 
in the equatorial plane of the dipole measured in units of one earth 
radius (see Fig. 1); position along the line of force is specified by either 
the magnetic induction (field strength), B, or by x, where x = (1 − B_0/B)^(1/2) 
is a convenient variable in the equations of the dynamics of charged 
particle motion. (B_0 is the magnetic induction at the equator on the 
line of force in question.) Magnetic dipole polar coordinates R and λ, 

Fig. 2— The spatial distribution of data for L < 3 in R,λ coordinates. Every 
twentieth point from the L-ordered data is plotted. 

where R is the radial distance in earth radii and λ is the latitude angle, 
offer a sufficiently close analog to geographic coordinates to be 
convenient in many circumstances. The choice among these sets is 
discussed in Section III, as are the reasons for choosing the square root 
of the counting rate as the scale for the dependent variable. 
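
The relations among the three coordinate pairs can be illustrated numerically. The sketch below assumes the pure dipole relations listed in Table I (B_0 = 0.311653/L³, R = L cos²λ, B = (0.311653/R³)(4 − 3R/L)^(1/2)); the function names are hypothetical and chosen only for this illustration.

```python
import math

M_DIPOLE = 0.311653  # gauss * R_e^3, the constant quoted in Table I


def equatorial_field(l_shell):
    """B_0, the equatorial field on the shell L, in gauss."""
    return M_DIPOLE / l_shell ** 3


def x_from_b_l(b_gauss, l_shell):
    """x = (1 - B_0/B)^(1/2); clamped at zero to absorb rounding at the equator."""
    return math.sqrt(max(0.0, 1.0 - equatorial_field(l_shell) / b_gauss))


def b_l_from_r_lambda(r, lat_deg):
    """Map dipole polar coordinates (R in earth radii, latitude in degrees) to (B, L)."""
    l_shell = r / math.cos(math.radians(lat_deg)) ** 2
    b_gauss = M_DIPOLE / r ** 3 * math.sqrt(4.0 - 3.0 * r / l_shell)
    return b_gauss, l_shell


# The particle of Fig. 1: L = 2.5, mirroring at B = 0.0266 gauss.
x_mirror = x_from_b_l(0.0266, 2.5)
alpha0_deg = math.degrees(math.acos(x_mirror))  # cos(alpha_0) = x at the mirror point
print(round(x_mirror, 3), round(alpha0_deg, 1))
```

For the Fig. 1 example this gives x close to 0.5, that is, an equatorial pitch angle close to 60°, since cos α_0 is numerically equal to x at the mirror point.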

The coordinates and variables, together with other symbols used 
in this analysis, are listed in Table I under the following headings: 
Radiation Intensity, Position and Orientation, Instrument and Energy 
Spectrum, Mathematical Model, Statistics, and Other. Summary in- 
formation concerning units, constants, derivations, and sources is 
included. 

The satellite was confined to the volume of space {1.09 R_e ≤ 
R ≤ 1.95 R_e,* 0 ≤ λ ≤ 58°}. For {L > 3, R < 1.95 R_e}, the average 
counting rate is very nearly zero, and these data were not examined 
further. About 5 percent of the 50-130 MeV proton data for L ≤ 3 
were associated with noise bursts which affected adjacent telemetry 
channels; these data were discarded. The study described below is 
based on the remaining 77,649 observations. 

*R_e = earth radius. 

The spatial distribution of the data is indicated in Fig. 2, which is 
a plot in R,λ coordinates of the position of every twentieth point from 
the L-ordered data. Although data were not acquired continuously 
during the 226 days that the satellite was active, there are no time 
gaps in the data longer than two days in duration. 
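
The 226-day figure is consistent with the launch and termination day numbers quoted in the Introduction (day 191 of 1962 and day 52 of 1963). A short check, with a hypothetical helper name, is:

```python
from datetime import date, timedelta


def date_from_day_number(year, day_of_year):
    """Convert a day-of-year number (day 1 = January 1) to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


launch = date_from_day_number(1962, 191)      # July 10, 1962
last_active = date_from_day_number(1963, 52)  # February 21, 1963
active_days = (last_active - launch).days
print(launch, last_active, active_days)
```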

Fig. 3 is a plot of bands of constant counting rate made by plotting 
the R,λ coordinates at which certain specified numbers of counts were 
recorded during 11-second counting intervals. The data in Fig. 3 cover 
the entire seven-month life of the satellite. The narrowness of the 
contour bands demonstrates that the data are exceptionally well-behaved 
in both time and space, and that one may reasonably hope to describe 
radiation intensity in terms of R,λ coordinates or their equivalent. 

Among the various sources of error in the data are: noise present 
in the received telemetry signal or introduced during the recording and 
processing of the telemetry; errors in the time as recorded by the 
ground station; errors in the satellite ephemeris; differences between 
the real magnetic field of the earth and the values of B and L 
calculated from the coefficients in the computer program INVAR (see Table 
I); and instrumental effects. In addition, one expects statistical 
fluctuations in the measured counting rate at a fixed position. The 
importance of these sources of error is discussed later. 


Fig. 3— Bands of constant numbers of counts in 11 seconds in R,λ space: 
Band a, 4; Band b, 32; Band c, 127-129; Band d, 254-258; Band e, 508-516 counts. 
All the data from the seven-month period are displayed. 
III. CHOICE OF THE PRINCIPAL VARIABLES AND THEIR SCALES 


The current state of knowledge of the earth’s radiation belts sug- 
gests that the spatial distribution of high-energy protons may reason- 
ably be organized on the basis of a two-dimensional magnetic coordi- 
nate system, except perhaps at very low altitudes near the South 
American magnetic anomaly, where longitude also becomes important. 
Telstar® 1 data plotted in Fig. 3 indicates that the observed counting-rate 
data does indeed depend principally on the magnetic coordinates, 
R and λ. The coordinates R,λ are defined in terms of the mathematically 
equivalent pair B,L.2 A third equivalent set consists of L together 
with the coordinate x, suggested by Roberts,’ defined in Table I. 

We have primarily employed the x,L set in this study because of 
the following considerations: In the adiabatic theory, the mirror points 
of particles do not migrate between magnetic shells.¹¹ Within any shell, 
the coordinate x is approximately linear in λ for λ < 30°, and thus the 
near-equatorial data is not “crowded” into a small interval of the 
coordinate, as is the case for B. Moreover, we have been able to 
develop simple functional representations of the data in terms of x and L. 
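
The contrast between x and B near the equator can be made explicit with a leading-order expansion. The lines below are a sketch that assumes the pure dipole relations of Table I, with λ in radians and small; they are not taken from the paper's own derivations.

```latex
\begin{align*}
R &= L\cos^{2}\lambda, \qquad
\frac{B}{B_0} = \frac{\sqrt{4 - 3\cos^{2}\lambda}}{\cos^{6}\lambda}
  \approx 1 + \tfrac{9}{2}\,\lambda^{2}, \\[4pt]
x &= \left(1 - \frac{B_0}{B}\right)^{1/2}
  \approx \frac{3}{\sqrt{2}}\,\lambda
  \qquad (\lambda \ll 1).
\end{align*}
```

To this order B departs from its equatorial value only quadratically in λ, so equal steps in latitude near the equator map into very small steps in B, whereas x grows in proportion to λ.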

The flux of particles is the variable of greatest physical interest for 
comparing the results of different experiments, calculating physical 
effects of the radiation (such as radiation damage to devices in pro- 
posed orbits), deriving an energy spectrum from experimental meas- 
urements, examining the implications of various source and loss mech- 
anisms, etc. However, the flux is not measured directly and requires 
for its calculation knowledge of the energy spectrum of the particles 
and of the energy dependence of the geometric factor of the detector. 
Even in the present circumstances where the conversion is (under the 
assumptions of Appendix A) quite insensitive to these, we prefer to 
carry out the bulk of the data analysis in terms mathematically equiv- 
alent to the directly observed counting rates. 

From among the possible representations of the counting rate in- 
formation (including counting rate, log counting rate, and square root 
of counting rate) the square root of the observed counting rate, Y, has 
been selected as the dependent variable. On the hypothesis that the 
number of counts in a given 11-second counting interval at any given 
position in space is a random variable with a Poisson distribution, it 
can be shown that the variance of Y is approximately constant, inde- 
pendent of its average value (see Appendix B.2). The least squares 
criterion has been used in all the estimating procedures; that is, coeffi- 
cient estimates have been selected so that the sum of squares of dif- 


TABLE I—COORDINATES, VARIABLES AND NOTATION 

The redundant use of a few symbols is partly due to the decision to retain “standard” notation in both geophysics and statistics. 
The context should resolve any apparent ambiguities. Some symbols used locally in the text are not included in this table. 

Radiation Intensity 

| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| J | Fitted average omnidirectional flux | protons/cm² sec | | Y, g | Equation (21). |
| j | Predicted unidirectional flux | protons/cm² sec ster | | y, g | Equation (8). |
| Y | Square root of observed counting rate | (counts/sec)^(1/2) | telemetry | Z | |
| y | Fitted average value of Y | (counts/sec)^(1/2) | least squares fit | Y, x, L | Section IV. This symbol is used generically for all the models. |
| Z | Counts in an 11-second counting interval | counts | telemetry | | Random variable. |

Position and Orientation 

| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| B (vector) | Magnetic induction | gauss | INVAR | r, θ, φ | Computer program INVAR by McIlwain® containing the Jensen and Cain’? magnetic field coefficients for 1960. R_e = 6371.2 km. |
| B | Magnetic field strength | gauss | magnitude of B (vector) | r, θ, φ | |
| B_0 | Equatorial value of B | gauss | | | B_0 = 0.311653/L³ = 0.311653/R?3. |
| L | Magnetic shell parameter | ratio to earth radius (R_e) | INVAR | r, θ, φ | See B, above. |
| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| L_m | Midpoint of an L-slice | same as L | | | Section V. |
| R | Magnetic dipole radial distance | R_e | B, L | r, θ, φ | B = (0.311653/R³)(4 − 3R/L)^(1/2). |
| R_e | Earth radii | km | Heiskanen!* | | 6371.2 km. |
| r | Geocentric distance | earth radii (R_e) | ephemeris | tracking data | For geomagnetic calculations, r is corrected to altitude above the International Ellipsoid [Heiskanen“]. R_e = 6371.2 km. |
| T | Universal time | days | clock at telemetry receiving station | | Measured in days from 0 hr 0 min U.T., Jan. 0, 1962. |
| t | Local time | hours | | T, φ | Apparent sun time (local mean time corrected for the equation of time taken from the American Ephemeris and Nautical Almanac!*). |
| x | [1 − 0.311653/(B L³)]^(1/2) | dimensionless | B, L | r, θ, φ | See above for B and L. |
| α_0 | Equatorial pitch angle | degrees | | | Fig. 1. |
| ψ | Angle between satellite spin axis and local magnetic vector | degrees | | B (vector), B, χ, δ | cos ψ = (B · ω)/B, with B (vector) in the numerator and the field strength B in the denominator; ω is a unit vector parallel to the angular momentum vector of the satellite. |
| δ | Declination of the satellite spin axis | degrees | mirror flash data | r, θ, φ, T, and astronomical data | Optical observations of the reflection of the sun from mirrors on the satellite, Courtney-Pratt, et al’. |
| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| θ | Colatitude | degrees | ephemeris | tracking data | Geocentric angle. |
| λ | Magnetic dipole latitude | degrees | | r, θ, φ | λ = arc cos[(R/L)^(1/2)]. |
| μ_0 | cos α_0 | dimensionless | | | Numerically equal to x. |
| φ | East longitude | degrees | ephemeris | tracking data | Geocentric angle. |
| χ | Right ascension of the satellite spin axis | degrees | mirror flash data | r, θ, φ, T, and astronomical data | See δ, above. |
| ω | Direction of the spin angular momentum vector of satellite | dimensionless | mirror flash data | χ, δ | See δ, above. |

Instrument and Energy Spectrum 

| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| E | Energy | MeV | | | |
| E_0 | e-folding energy | MeV | | | Used in energy spectrum, Appendix A. |
| g | Geometric factor of the detector | cm² ster | detector geometry | | |
| ḡ | Average geometric factor of detector | cm² ster | detector geometry | proton energy spectrum | Equation (20). |
| M | Exponent of integral power-law energy spectrum | dimensionless | | | |
| N | Number of protons | dimensionless | | | |
| n | Exponent of differential power-law energy spectrum | dimensionless | | | |
| v_b | Bias voltage | bits | telemetry | resistor calibration | Each bit represents a step of −1.108 volts. |
| T | Skin temperature | °C | telemetry | thermistor calibration | Measured near the detector. |


THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967 


TABLE I— (Cont'd). 


Symbol Coordinate | Units | Source 


Mathematical Model 


ae A”, AM Equatorial value of y (counts/sec)!/2 fitting 
Ay Coefficient (counts/sec)!/ fitting 
1, G2, 23, 4, a5 | Coefficients fitting 
b tee dimensionless 

es Coefficients : oe 

F see dimensionless 

G;G, G",.G’ dimensionless 











Underlying 
variables 


x, (L) 
L 


L, (x) 
a, (L) 


Remarks 


The superscripts indicate 


various models, see Sec- 
tion IV. In particular 
A’ indicates Model I 
and A” indicates Model 
II. N.B. A is used 
generically for all the 
models, or when the 
distinction is unimpor- 
tant or clear from the 
context. 

Maximum value of A” 
(and therefore y’’), 
a II, Equation 
11). 

Coefficients of A” and 
A, Equations (6) and 
16 


Equation (18). 

Equation (18). 

Equation (19). 

Describes the z-depen- 
dence of y for the 
models indicated by the 
superscripts, see Sec- 
tion IV. N.B. G is used 
generically for all the 
models, or when the 
distinction is unimpor- 
tant or clear from the 
context. 
Symbol 


Coordinate 


Mathematical Model (Cont'd) 





Do 
Lp 
Ly 


M 
Pr 


Pi 

Q 

R. 

T1, Ta, 13, (r4), 
rs 


So, $1 


Ce 


Ue Yn, gg 


/ 





Coefficient 
Coefficient 
Coefficient 
Coefficient 
Coefficient 
Coefficients 
Coefficient 
R at cutoff 
Coefficient 


Shape factor 
Coefficient 


Cutoff function 


Fitted average value of y 


Fitted value 
Coefficient 


Units 


same as L 
same as L 
same as L 
dimensionless 
dimensionless 
dimensionless 
e 
dimensionless 


dimensionless 


(counts /sec)!/2 


(counts/sec)!/? 


dimensionless 


Source 


fitting 
fitting 
fitting 
fitting 
fitting 
fitting 
fitting 
fitting 


fitting 
fitting 


fitting 


fitting 


fitting 
fitting 


Underlying: 


Siekols 


ia 


xz, L 








Remarks 


Smallest value of ZL for 


which y > 0. 

Position of A, is at (z, L) 
= (0, Ly). 

Equation (16). 

Model III, Equation (15). 

Model ITI, Equation (15). 

Equation (19). 

Model III, Equation (15). 

Equation (4). 

Coefficients of R., Equa- 
tion (5). 

Equation (3). 

Coefficients of S, Equa- 
tion (3). 

Smallest value of zx for 
ia y = 0, Equation 

4). 

The subscript and super- 
scripts indicate various 
models, see Section IV. 
In particular y’ indi- 
cates Model I and 7” 
indicates Model II. 

N. B. y is used generically 
for all the models, or 
when the distinction is 
unimportant or clear 
from the context. 

Corresponds to the obser- 
vation Y;. 

Shape factor, Equations 
(6) and (11). 
Other 

| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| CB | Complete body (of data) | | | | Designates all the data, see Section 4.5. |
| HTB | High temperature and high bias voltage (data) | | | | Designates a subset of the data, see Sections 4.5 and 6.9. |

Statistics 

| Symbol | Coordinate | Units | Source | Underlying variables | Remarks |
|---|---|---|---|---|---|
| Cov | Covariance | | | | |
| df | Degrees of freedom | dimensionless | | | See Wilks¹⁶. |
| D/?, D+? | Squared distance | | | | Appendix B.6. |
| g | Function | | | | |
| h | Function | | | | |
| n | Number of observations | | | | |
| R² | Squared multiple correlation coefficient | | | | |
| Res, RES | Residual | | | | Observed minus fitted. |
| SS | Sum of squares | | | | |
| u_i | Function of Y_i | | | | |
| ū | Mean of u_i | | | | |
| Var | Variance | | | | Wilks¹⁶. |
| w | Independent variable (vector) | | | | |
| w_i | Components of w | | | | |
| z | Values of the random variable Z | counts | | | |
| α | Dependence coefficient | | | | α = (1 − √(1 − ρ²)) sign(ρ), Wilk”, Equation (31). |
| β_i | Confidence coefficient | | | | Wilks¹⁶. |
| δ | Correction to θ (vector) | | | | |
| δ_i | Component of δ | | | | |
| θ, θ′ | Coefficient vector | | | | |
| θ_i | Component of θ′ | | | | |
| θ̂ | Estimate of θ′ | | | | |
| θ̂_i | Components of θ̂ | | | | Estimates of θ_i. |
| ν | Average value of a Poisson variable | counts | | | |
| ρ | Correlation coefficient | dimensionless | | | |
| σ | Standard deviation | | | | |

MSD, MSE, MSR: The terms mean square error (MSE), mean square residual (MSR) and mean square deviation (MSD) are used 
in this document to denote related but different entities, each measuring “goodness of fit” in relation to different 
situations. When a selected array of data is fitted by a model, the minimum sum of squares of residuals from the 
fit of those data divided by the degrees of freedom (number of selected observations minus number of coefficients 
fitted) is termed the MSE. When a fit based on a sample of data is used to generate residuals for all of the data, 
without refitting, the total sum of squares of these residuals divided by the number of residuals is termed the MSR. 
For defined “small” cells in x,L space, the sum of squares of deviations of observations from their average in the 
cell divided by the number of such deviations minus one is termed the MSD. 

ferences between observed and fitted values is minimized. The choice 
of the square root scale, Y, as the scale on which to represent the 
counting rate data makes troublesome differential weighting of the 
data in the least squares fitting unnecessary. Similarly, plots of Y 
versus various variables are convenient since the scatter in Y is ap- 
proximately independent of the value of Y. In fact, the square root 
transformation will make the variance of the observation approxi- 
mately independent of its average value whenever the variance is pro- 
portional to the mean. Thus, the procedure is more robust than the 
assumption of a Poisson distribution, for which the variance equals 
the mean. Further discussion and detail is given in Appendices B.2 
and C. 
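
The variance-stabilizing property of the square root scale can be checked numerically. The sketch below is an illustration under the Poisson hypothesis stated above, not the derivation of Appendix B.2: it sums the Poisson probabilities directly to obtain the variance of Y = sqrt(Z/11) for several mean counts and compares it with the large-mean limit 1/(4 × 11). The function names and the chosen means are assumptions made for this example.

```python
import math

INTERVAL_S = 11.0  # counting interval in seconds, from the text


def poisson_pmf(k, mean):
    """Poisson probability of k counts, computed in log space for stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def variance_of_sqrt_rate(mean_counts):
    """Variance of Y = sqrt(Z / 11) when Z is Poisson with the given mean."""
    # Sum far enough into the upper tail that the neglected mass is negligible.
    upper = int(mean_counts + 12.0 * math.sqrt(mean_counts) + 20)
    first = second = 0.0
    for k in range(upper + 1):
        p = poisson_pmf(k, mean_counts)
        y = math.sqrt(k / INTERVAL_S)
        first += p * y
        second += p * y * y
    return second - first * first


limit = 1.0 / (4.0 * INTERVAL_S)
variances = {nu: variance_of_sqrt_rate(nu) for nu in (20, 100, 500)}
for nu, var in variances.items():
    print(nu, var, var / limit)
```

Although the mean count changes by a factor of 25 across these cases, the variance of Y stays within a few percent of the same value, which is why equal weighting of the observations is reasonable on this scale.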

The results were restored to counting rate and the flux was calculated 
using the best estimate of the average geometric factor, ḡ (see Appendix 
A), to facilitate the discussion of the physical significance of the 
measurements. 
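
As a rough illustration of this restoration step, the sketch below squares a value on the Y scale to recover a counting rate and converts it to an omnidirectional flux. The conversion used, J = 4π × (counting rate)/ḡ, is the conventional relation for an isotropic flux and is an assumption here: Equation (21) is not reproduced in this section and may differ. The function name is hypothetical.

```python
import math

G_BAR_CM2_STER = 0.143  # average geometric factor quoted in Section II


def omnidirectional_flux(y, g_bar=G_BAR_CM2_STER):
    """Convert y in (counts/sec)^(1/2) to flux in protons/cm^2 sec.

    Assumes an isotropic flux, so that J = 4*pi * rate / g_bar. This is an
    illustrative convention, not necessarily the paper's Equation (21).
    """
    rate = y * y  # restore the counting rate, counts/sec
    return 4.0 * math.pi * rate / g_bar


# Example: 512 counts in 11 seconds (within Band e of Fig. 3).
y_band_e = math.sqrt(512 / 11.0)
print(omnidirectional_flux(y_band_e))
```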


IV. THE EVOLUTION OF THE MODELS 


4.1 General Approach 


This section provides a summary overview of the evolution of the 
models, the details and accomplishments of which are elaborated in 
the following sections and appendices. 

The approach to model development in this study has been largely 
empirical. Theoretical physics considerations are currently too com- 
plex and speculative to do more than serve as a general guide and 
stimulus. We have proceeded on the presumption that an adequate 
model for the spatial distribution of the high-energy protons can be 
based on the mapping of the earth’s magnetic field onto a two-dimen- 
sional axially symmetric dipole field, expressed, for example, in the 
coordinates x and L. This is supported by the plots of Fig. 3, the 
successful polynomial fits on L-lines of McIlwain,¹⁸ Valerio,¹⁹ and 
Fillius,²⁰ and by the results of the present study. 

The ultimate justification of the mathematical models developed 
herein is that, when appropriate estimates of coefficients are inserted, 
good fits to the data are obtained. Various other mathematical, phys- 
ical, and statistical considerations also provided guidance and evalua- 
tion. 

The evolution involved successive interactions with the data and 
iteration on models. Roughly, the main stages included: grouping the 
data into L-slices; inferring a mathematical function having adjustable 
coefficients which would fit a selected series of L-slices; developing a 
mathematical function to describe the dependence of the L-slice coeffi- 
cients on L; thence fitting the two-dimensional model so-defined to a 
sample of the data; using this fit to screen outliers, to detect instru- 
mental effects and, after partitioning the data, to select a representa- 
tive sample of partitioned data for further fitting; after obtaining a 
very good fit to the partitioned data, some remaining physical and 
statistical defects of the model were overcome by a reparametrization 
and specialization. Further generalizations of the model were also 
tested. 


4.2 The L-slice Model 


As a developmental operational procedure (encouraged by the L-shell 
orientation of the adiabatic theory’’) the data were grouped into a 
series of narrow bands according to L values (e.g., 1.849 < L ≤ 1.851) 
and plotted versus x. Retrospectively, there is every reason to believe 
that an initial approach based on grouping the data into x-slices would 
also have led to an effective analysis (see Section 13.2). Various func- 
tional forms, having adjustable coefficients dependent on L, were tested 
for adequacy of fit to the L-slices. 

Initially, we employed the functional form 


\[ y_L(x) = A \, G(x; x_c, S), \tag{1} \]


where A, x_c, and S are fitted coefficients for each L-slice, and 


\[ G(x; x_c, S) = \begin{cases} \dfrac{[1 - (x/x_c)^2]^{S + 1/4}}{(1 - x^2)^{1/4}} & (x \le x_c), \\ 0 & (x > x_c). \end{cases} \tag{2} \]

For this body of data from the region {R ≤ 1.95 R_e, 1.15 ≤ L ≤ 3.0}, 
we have found this y_L(x) function provides an adequately flexible model 
on L-slices, for appropriately fitted values of the coefficients A, x_c, 
and S. In this representation for given fixed L, the quantity A² may be 
interpreted as the average equatorial omnidirectional counting rate, 
since x = 0 on the equator; x_c represents a “cutoff” value for x, i.e., 
the cosine of the equatorial pitch angle corresponding to the “loss cone”; 
and S has the effect of a shape factor in the dependence of y_L on x. 
The analysis using this y_L(x) model is described in Section V. 
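
As an illustration, the L-slice model can be written as a short function. This is a sketch under a stated assumption: the shape function is taken to be G = [1 - (x/x_c)^2]^(S + 1/4) / (1 - x^2)^(1/4) for x below the cutoff and 0 beyond it, which is the form consistent with the properties described in this paper. The function names are hypothetical, and the coefficients used are those listed in Table II for the slice at L_m = 1.35.

```python
def g_slice(x, x_c, s):
    """Shape function G(x; x_c, S), in the form assumed in this sketch."""
    if x >= x_c:
        return 0.0
    return (1.0 - (x / x_c) ** 2) ** (s + 0.25) / (1.0 - x * x) ** 0.25

def y_slice(x, a, x_c, s):
    """Square root of counting rate, y_L(x) = A * G(x; x_c, S)."""
    return a * g_slice(x, x_c, s)

# Coefficients listed in Table II for L_m = 1.35.
A, X_C, S = 6.757, 0.6795, 0.324
for x in (0.0, 0.2, 0.4, 0.6, 0.7):
    # y^2 is the modeled counting rate in counts/sec.
    print(x, y_slice(x, A, X_C, S) ** 2)
```

At x = 0 the function returns A, so that A² is the equatorial counting rate, and it returns zero at and beyond the cutoff.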

4.3 Dependence on L 

The y_L(x) model was fitted to a series of L-slices, obtaining fitted 
values of A, x_c, and S. These were each plotted against the nominal 
(mid-range) L value for the slice and a reasonably smooth variation 
with L was obtained. 

Thence we inferred the following functional dependence of the L- 
slice coefficient estimates on L: 


\[ S = S(L) = s_0 + s_1 L, \tag{3} \]


\[ x_c = x_c(L) = \left\{ 1 - \left( \frac{R_c}{L} \right)^3 \left[ 4 - \frac{3 R_c}{L} \right]^{-1/2} \right\}^{1/2}, \tag{4} \]


\[ R_c = R_c(L) = L_0 + r_1 (L - L_0) + r_2 (L - L_0)^2 + r_3 (L - L_0)^3, \tag{5} \]


\[ A = A(L) = \begin{cases} \dfrac{a_1 (L - L_0)}{a_2 + (L - a_3)^{\eta}} & (L \ge L_0), \\ 0 & (L < L_0), \end{cases} \tag{6} \]


where s_0, s_1, r_1, r_2, r_3, a_1, a_2, a_3, η, and L_0 are fitted coefficients. 

Equation (4) simply expresses the mathematical relationship between 
R (or R_c) and x (or x_c) in the magnetic dipole field (see Table 
I). The coefficient L_0, which occurs in A'(L) and x_c(L), may be 
interpreted as the lower bound of the L shells on which protons with 
energies above 50 MeV were measurable. The quantity R_c(L) is such that 
R_c(L) - 1 is the equivalent dipole altitude at which the counting rate 
falls to zero. 
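
The L-dependence of the slice coefficients can be sketched in code as follows. The forms assumed are S(L) = s_0 + s_1 L, a cubic R_c(L) in (L - L_0), the dipole relation x = {1 - (R/L)^3 [4 - 3R/L]^(-1/2)}^(1/2), and A(L) = a_1 (L - L_0) / [a_2 + (L - a_3)^η] for L at or above L_0. All function names are hypothetical, and the coefficient values in the example are invented purely to exercise the functions; they are not fitted values from this study.

```python
import math

def s_of_l(l, s0, s1):
    """Shape coefficient S(L), linear in L."""
    return s0 + s1 * l

def r_c_of_l(l, l0, r1, r2, r3):
    """Cutoff radial distance R_c(L) in earth radii, cubic in (L - L0)."""
    d = l - l0
    return l0 + r1 * d + r2 * d**2 + r3 * d**3

def x_from_r(r, l):
    """Dipole relation between radial distance r on shell l and x."""
    q = r / l  # equals cos^2(latitude) in a dipole field
    return math.sqrt(max(0.0, 1.0 - q**3 / math.sqrt(4.0 - 3.0 * q)))

def x_c_of_l(l, l0, r1, r2, r3):
    """Cutoff x_c(L), obtained from R_c(L) through the dipole relation."""
    return x_from_r(r_c_of_l(l, l0, r1, r2, r3), l)

def a_of_l(l, l0, a1, a2, a3, eta):
    """Equatorial value A(L); zero below the inner bound L0."""
    if l < l0:
        return 0.0
    return a1 * (l - l0) / (a2 + (l - a3) ** eta)

# Invented coefficients, for illustration only.
for l in (1.1, 1.3, 1.5, 2.0):
    print(l, x_c_of_l(l, 1.1, 0.5, 0.0, 0.0), a_of_l(l, 1.1, 1.0, 0.5, 0.8, 6.0))
```

With these forms the cutoff x_c is zero at L = L_0, where R_c = L_0, and A vanishes there as well.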


4.4 A Two-Dimensional Model—Model I 


The conjunction of (1) to (6) defines a two-dimensional model, re- 
ferred to henceforth as Model I, 


\[ y'(x, L) = A'(L) \, G'(x; x_c(L), S(L)), \tag{7} \]


where G' is essentially the function G of (2), with x_c and S explicitly 
dependent on L. 

Though empirical considerations mainly guided the choice of these 
functions, some physical and mathematical properties influenced the 
choice. In the present case, in which the geometric factor of the 
detector is considered to be independent of the energy spectrum (see 
Appendix A), [y'(x, L)]² transforms in closed form to the equatorial 
pitch angle distribution, giving?® 


\[ j(\mu_0, L) = \frac{[A'(L)]^2 \, [1 - (\mu_0 / x_c(L))^2]^{2S(L)}}{2 \pi \bar{g} \, x_c(L) \, \beta(\tfrac{1}{2}, 1 + 2S(L))}, \tag{8} \]


where j(μ_0, L) is the predicted equatorial unidirectional flux (protons/ 
cm² sec ster) at equatorial pitch angle α_0 = arc cos μ_0, and β is the 
beta function, 


\[ \beta(\nu, \omega) = \int_0^1 u^{\nu - 1} (1 - u)^{\omega - 1} \, du. \tag{9} \]
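
The closed form transformation can be checked numerically. The sketch below is an illustration under explicit assumptions, not a reproduction of the authors' derivation: the counting rate is taken to be [A G]² with G = [1 - (x/x_c)^2]^(S + 1/4) / (1 - x^2)^(1/4); the unidirectional flux is taken to be j = A² [1 - (μ_0/x_c)^2]^(2S) / [2π g x_c β(1/2, 1 + 2S)]; the omnidirectional flux is taken to be 4π times the integral of j over the local pitch angle cosine μ from 0 to 1; and μ_0² = x² + μ²(1 - x²) in a dipole field. The geometric factor g is set to 1 and all function names are hypothetical.

```python
import math

def beta_fn(v, w):
    """Beta function expressed through gamma functions."""
    return math.gamma(v) * math.gamma(w) / math.gamma(v + w)

def counting_rate(x, a, x_c, s):
    """[y'(x)]^2 under the assumed form of G."""
    if x >= x_c:
        return 0.0
    g_shape = (1.0 - (x / x_c) ** 2) ** (s + 0.25) / (1.0 - x * x) ** 0.25
    return (a * g_shape) ** 2

def j_equatorial(mu0, a, x_c, s, g=1.0):
    """Assumed equatorial unidirectional flux at pitch angle cosine mu0."""
    if mu0 >= x_c:
        return 0.0
    norm = 2.0 * math.pi * g * x_c * beta_fn(0.5, 1.0 + 2.0 * s)
    return a * a * (1.0 - (mu0 / x_c) ** 2) ** (2.0 * s) / norm

def omni_rate_from_j(x, a, x_c, s, g=1.0, n=20000):
    """g * 4*pi * integral of j over local mu, by the midpoint rule."""
    if x >= x_c:
        return 0.0
    # Local mu beyond mu_max maps to mu0 > x_c, where j vanishes.
    mu_max = math.sqrt((x_c**2 - x**2) / (1.0 - x**2))
    h = mu_max / n
    total = 0.0
    for i in range(n):
        mu = (i + 0.5) * h
        mu0 = math.sqrt(x * x + mu * mu * (1.0 - x * x))
        total += j_equatorial(mu0, a, x_c, s, g)
    return g * 4.0 * math.pi * total * h

for x in (0.0, 0.3, 0.6):
    print(x, counting_rate(x, 6.757, 0.6795, 0.324),
          omni_rate_from_j(x, 6.757, 0.6795, 0.324))
```

Under these assumptions the two columns printed for each x agree to within the accuracy of the numerical integration, which is the sense in which the model and the pitch angle distribution are equivalent.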


In addition y'(x, L) has good boundary behavior. The derivative at 
the magnetic equator, dy'(0, L)/dx, is 0, which provides continuity. 
When 1/4 < S(L) < 3/4, then dy'(x_c, L)/dx → -∞ and d[y'(x_c, L)]²/ 
dx = 0. The estimated values of S do satisfy this constraint in the 
present case. The desirable consequences of this behavior of the 
derivatives will be discussed in Section V. The function y'(x, L) gives 
smooth interpolation over regions sparse in data, and does not have 
any of the wild fluctuations often associated with polynomial fits. 
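
The behavior of the derivatives at the cutoff can be illustrated with one-sided difference quotients. This sketch assumes the form y = A [1 - (x/x_c)^2]^(S + 1/4) / (1 - x^2)^(1/4) and uses the coefficients listed in Table II for L_m = 1.35, for which S lies between 1/4 and 3/4. The function names are hypothetical.

```python
def y_model(x, a, x_c, s):
    """Assumed model for the square root of counting rate at fixed L."""
    if x >= x_c:
        return 0.0
    return a * (1.0 - (x / x_c) ** 2) ** (s + 0.25) / (1.0 - x * x) ** 0.25

def one_sided_slopes(a, x_c, s, eps):
    """Difference quotients of y and y^2 between x_c - eps and x_c."""
    y = y_model(x_c - eps, a, x_c, s)
    # y(x_c) = 0, so each slope is (0 - value) / eps.
    return -y / eps, -(y * y) / eps

for eps in (1e-2, 1e-4, 1e-6):
    print(eps, one_sided_slopes(6.757, 0.6795, 0.324, eps))
```

As eps shrinks, the slope of y grows without bound in magnitude while the slope of y² shrinks toward zero, in line with the statement above.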

The analysis of the data using Model I is described in Section VI. 


4.5 Summary Uses of Model I. 

The unspecified coefficients of Model I were estimated by nonlinear 
least squares fitting to a sample of about 1000 observations from the 
complete body of data. Thence this fit of Model I (the CB fit) was 
evaluated relative to all the data and to auxiliary variables, such as 
time, which were not included in the model. Outliers were thereby de- 
tected and screened. An instrumental effect was uncovered (see Section 
6.8), and this led to an objective partitioning of the data, yielding a 
subset (HTB data) for further analysis. The CB fit of Model I was 
also used to specify a representative data sampling procedure for fur- 
ther fitting to the HTB data. 

Though Model I produces a very good fit to the HTB data (see Sec- 
tion VII), it has certain physical and statistical defects. Specifically, 
though the quantities A and x_c in the L-slice model have a direct 
physical interpretation, most of the coefficients in y'(x, L) do not. 
Additionally, the estimates of the coefficients in A'(L) turn out to 
have exceedingly high statistical correlations and the model y'(x, L), 
as a function of the coefficients, exhibits marked nonlinearities even 
in a close neighborhood of the least squares estimates (see Section 8.5). 

Therefore, after clarifying the character of the data and obtaining 
a good fit, attention was given to additional improvements of the 
model. 

4.6 A Modified Model—Model II 


The statistical difficulties of Model I were entirely overcome by 
employing a specialized version of A'(L), defined below. Furthermore, 
this specialized model, Model II, retains all the desirable properties 
of Model I while providing both aesthetic improvement and greater 
physical interpretability. 

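Model II is defined by 


\[ y''(x, L) = A''(L) \, G''(x; x_c(L), S(L)), \tag{10} \]


where G'' is as in (2), but with S(L) = s_0, and 


\[ A''(L) = \begin{cases} \dfrac{A_1 \, \eta \, (L - L_0)/(L_1 - L_0)}{(\eta - 2) + 2 \left\{ (L_1 + L - 2L_0) / [2 (L_1 - L_0)] \right\}^{\eta}} & (L \ge L_0), \\ 0 & (L < L_0), \end{cases} \tag{11} \]


where A_1, L_0, L_1, and η are the coefficients to be estimated. 
A''(L) is a special case of A'(L) and relates to it by the following 
transformations: 


\[ \begin{aligned} L_0 &= L_0, \\ \eta &= \eta, \\ a_3 &= 2L_0 - L_1, \\ a_2 &= 2^{\eta - 1} (\eta - 2)(L_1 - L_0)^{\eta}, \\ a_1 &= 2^{\eta - 1} A_1 \eta (L_1 - L_0)^{\eta - 1}. \end{aligned} \tag{12} \]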


Indeed, Model II is essentially defined by the following nonlinear con- 
straint imposed on Model I: 


\[ a_2 = 2^{\eta - 1} (\eta - 2)(L_0 - a_3)^{\eta}. \tag{13} \]

The coefficients of A''(L) in Model II have the following physical 
interpretations: 


L_0 (as before) is the smallest value of L such that high-energy 
protons are measurable by the instrument; 

A_1 is the square root of the maximum counting rate of high-energy 
protons in the radiation belt; 

L_1 is the value of the magnetic shell parameter (on the equator, 
x = 0) at the highest radiation intensity; 

η may be interpreted as a shape factor for the equatorial (counting 
rate)^(1/2) function, A''(L). 

The model A''(L) has the form of a product, with the maximum 
value, A_1, being multiplied by a factor which decreases as L departs 
from L_1 in either direction. Note that the factor multiplying A_1 is 
dimensionless. 
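
The relation between the two parametrizations of the equatorial function can be verified numerically. The sketch assumes A'(L) = a_1 (L - L_0) / [a_2 + (L - a_3)^η], the Model II form A''(L) = A_1 η [(L - L_0)/(L_1 - L_0)] / {(η - 2) + 2 [(L_1 + L - 2L_0) / (2(L_1 - L_0))]^η}, and the mapping a_3 = 2L_0 - L_1, a_2 = 2^(η-1) (η - 2)(L_1 - L_0)^η, a_1 = 2^(η-1) A_1 η (L_1 - L_0)^(η-1). Function names are hypothetical and the coefficient values are invented for illustration; they are not fitted values.

```python
def a_model1(l, l0, a1, a2, a3, eta):
    """Model I equatorial function A'(L)."""
    if l < l0:
        return 0.0
    return a1 * (l - l0) / (a2 + (l - a3) ** eta)

def a_model2(l, l0, l1, a_peak, eta):
    """Model II equatorial function A''(L), with peak a_peak at L = l1."""
    if l < l0:
        return 0.0
    d = l1 - l0
    ratio = (l1 + l - 2.0 * l0) / (2.0 * d)
    return a_peak * eta * (l - l0) / d / ((eta - 2.0) + 2.0 * ratio ** eta)

def model2_to_model1(l0, l1, a_peak, eta):
    """Map Model II coefficients to the Model I coefficients (a1, a2, a3)."""
    d = l1 - l0
    a3 = 2.0 * l0 - l1
    a2 = 2.0 ** (eta - 1.0) * (eta - 2.0) * d ** eta
    a1 = 2.0 ** (eta - 1.0) * a_peak * eta * d ** (eta - 1.0)
    return a1, a2, a3

# Invented Model II coefficients, for illustration only.
L0, L1, A_PEAK, ETA = 1.1, 1.45, 7.0, 6.0
A1, A2, A3 = model2_to_model1(L0, L1, A_PEAK, ETA)
for l in (1.2, 1.45, 1.8, 2.5):
    print(l, a_model2(l, L0, L1, A_PEAK, ETA), a_model1(l, L0, A1, A2, A3, ETA))
```

The two functions agree at every L, and the Model II form returns A_1 at L = L_1, which is what gives the four coefficients their direct interpretation.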

The other fitted coefficients of Model II are s_0, which is a shape 
factor for the dependence of (counting rate)^(1/2) on x at constant L, 
and r_1, r_2, and r_3 which, with L_0, define the cutoff function x_c(L). 

The analysis of the HTB data using Model II and comparisons of 
Models II and I are considered in Section IX. 


4.7 Generalizations 


The previously defined models may be regarded as special cases of 
Model III defined by 


\[ y'''(x, L) = A'''(L) \, G'''(x; x_c(L), M(L), P(L), Q(L)), \tag{14} \]

where A'''(L) = A'(L), defined in (6), 


\[ G'''(x; x_c, M, P, Q) = \begin{cases} \dfrac{[1 - (x/x_c)^{M}]^{P + 1/Q}}{(1 - x^{M})^{1/Q}} & (x \le x_c), \\ 0 & (x > x_c), \end{cases} \tag{15} \]


x_c(L) is as defined in (4), and M(L), P(L) and Q(L) involve 
coefficients or functions to be fitted. 

The function G' is a special case of G''', in which M(L) = 2 and 
Q(L) = 4. This permits a closed form transformation to an equatorial 
pitch angle distribution. The function G'' additionally constrains P(L) 
= s_0, independent of L. 

The more general G''' in Model III can be used on L-slices to 
determine L-slice estimates of M, P, Q, as well as A and x_c, and these 
in turn inspected to infer functional dependence on L. Clearly, this 
more general form must lead to at least as good a fit as Models I or 
II. Work has been done with Model III** but no important 
improvement over Model II was obtained for this body of data. 

Neither of the fitted models y'(x, L) nor y''(x, L) is applicable far 
outside the spatial and energy regions that include the data analyzed 
here. For example, Models I and II do not fit well to the 26-33 MeV 
protons measured by the Telstar® 1 satellite, nor are they suitable for 
fitting many of the electron distributions. Preliminary investigations 
indicate that these remarks may not apply to G''', whose additional 
coefficients allow more rapid changes in curvature as a function of x. 

We have already shown for Telstar® 2 data® that A(L) can be 
extended to include description of the plateau of high-energy protons 
reported by McIlwain?® *? near the equator at R ≈ 2.2 R_e, beyond the 
orbital extremes of the Telstar® 1 satellite. The extension was made by 
adding a term to A'(L), (6), to give A^IV defined by 


\[ A^{\mathrm{IV}}(L) = A'(L) + a_4 \exp\left\{ -\left[ \frac{L - L_2}{a_5} \right]^2 \right\}, \tag{16} \]

where a_4, a_5, and L_2 are coefficients describing the equatorial 
distribution of the “excess” protons that give rise to the plateau. In the 
less stable parts of the radiation belts the early work on empirical time 
dependence presented by Gabbe and Brown! clearly requires extension. 


V. FITS ON THE L-SLICES 


The model of (1) and (2) was fitted to the data, on the scale of Y, 
in 92 individual L-slices, using a nonlinear, multidimensional, least 
squares, computer program (see Appendix B) to estimate the coefficients 
and produce various statistical measures. The procedure of fitting 
to L-slice data enabled one to test functional forms of y_L(x) and 
then to evolve functional forms for the dependencies of the coefficients 
of the L-slice models on L. 
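
A single L-slice fit can be sketched as follows. This is not the nonlinear least squares program of Appendix B; it is a deliberately simple stand-in that exploits the fact that, for fixed x_c and S, the model Y = A G(x; x_c, S) is linear in A, so that A has a closed form and only x_c and S need to be searched. The grid search, the assumed form of G, the function names, and the synthetic noise-free data are all assumptions made for illustration.

```python
def g_slice(x, x_c, s):
    """Assumed shape function G(x; x_c, S)."""
    if x >= x_c:
        return 0.0
    return (1.0 - (x / x_c) ** 2) ** (s + 0.25) / (1.0 - x * x) ** 0.25

def fit_slice(xs, ys, x_c_grid, s_grid):
    """Least squares fit of Y = A*G(x; x_c, S) on one L-slice.

    For fixed (x_c, S) the best A is sum(Y*G)/sum(G*G); (x_c, S) are
    found by exhaustive grid search. Returns (sse, A, x_c, S).
    """
    best = None
    for x_c in x_c_grid:
        for s in s_grid:
            g = [g_slice(x, x_c, s) for x in xs]
            gg = sum(v * v for v in g)
            if gg == 0.0:
                continue
            a = sum(y * v for y, v in zip(ys, g)) / gg
            sse = sum((y - a * v) ** 2 for y, v in zip(ys, g))
            if best is None or sse < best[0]:
                best = (sse, a, x_c, s)
    return best

# Synthetic, noise-free slice built from the Table II values for L_m = 1.35.
xs = [0.02 * i for i in range(45)]                   # x from 0 to 0.88
ys = [6.757 * g_slice(x, 0.6795, 0.324) for x in xs]
x_c_grid = [0.64 + 0.0005 * i for i in range(161)]   # 0.64 to 0.72
s_grid = [0.25 + 0.002 * i for i in range(101)]      # 0.25 to 0.45
result = fit_slice(xs, ys, x_c_grid, s_grid)
print(result)
```

With real data the grid resolution limits the precision of x_c and S, and the standard errors and correlations reported in Table II would have to come from a proper nonlinear least squares analysis.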

Proceeding in this manner, however, has a number of possible pit- 
falls. In particular, the estimates of coefficients within an L-slice may 
be highly correlated, and the reliability of the actual values of the 
estimated coefficients also depends on the pattern of data points in 
the particular L-slice, e.g., whether or not there are points near x_c. 
Hence, the estimated values for any particular coefficient may not ex- 
hibit a smooth dependence on L. 

The forms of the L-slices whose middle values of L, called L_m, are 
1.35, 1.801, 2.2015, and 1.79, respectively, are displayed in Figs. 4 to 7. 
The thin solid lines in the figures are the fits to the L-slice data (the 
meaning of the dashed and thick solid lines will be taken up later). The 
numerical values of the coefficients of the fits, and the widths of the 
slices are given in Table II. Figs. 4 and 5 are examples of the high 
quality of fit which is typically obtained for L-slices having L_m < 2. 

In Figs. 4(a) and 5(a), square root of counting rate is plotted 
against x. One sees that the fit to the data points (the thin solid line) 
is quite adequate. The cutoffs, x_c, are well-defined, the scatter in Y is 
approximately independent of y and the data are well-distributed in x. 

Fig. 4— Data from the L-slice centered at L_m = 1.35 and the results of three 
fits (fit to points, from CB coefficients, and from HTB coefficients) shown 
on four scales. 

Fig. 5— Data from the L-slice centered at L_m = 1.801 and the results of three 
fits shown on four scales. The partitioning in (a) is discussed in Section 7.1. 

TABLE II— COEFFICIENTS AND STATISTICS OF THE L-SLICE FITS. 

L_m                        1.35      1.801     2.2015    1.79 
L_min                      1.346     1.800     2.200     1.7895 
L_max                      1.354     1.802     2.203     1.7905 
ΔL                         0.008     0.002     0.003     0.001 
A                          6.757     4.109     1.70      4.324 
σ(A)                       0.053     0.031     0.12      0.043 
x_c                        0.6795    0.8998    0.954     0.923 
σ(x_c)                     0.0027    0.0044    0.011     0.015 
S                          0.324     0.390     0.58      0.478 
σ(S)                       0.018     0.024     0.10      0.060 
Number of pts              140       129       144       65 
MSE                        0.1125    0.0497    0.0282    0.0478 
Correlation coefficients 
  A with x_c               0.281     0.309     0.724     0.408 
  A with S                 0.605     0.561     0.940     0.548 
  x_c with S               0.774     0.820     0.890     0.944 








As the cutoff is sharp on the scale of y, it is convenient to have a 
function which has an infinite derivative at x_c. Otherwise the exact 
x at which y → 0 may have relatively little effect on the mean square 
error of the fit. This would lead to an ill-defined value for x_c, even 
though the data allow one to evaluate the position of the cutoff quite 
precisely for L values smaller than ≈1.9. 

In Figs. 4(b) and 5(b), the counting rate, Y², is plotted against x. 
The thin solid lines represent the same fits as those in Figs. 4(a) and 
5(a). One finds that the position of the cutoff is no longer well-defined 
on the plot. Instead the counting rate fades away as x increases. 
Having the derivative of y² equal to zero at the cutoff (as noted in the 
previous section) is suitable in this situation. The scatter in Y² now 
changes with y², and is greater for large values of y² (small values 
of x). This nonuniform scatter makes it more difficult to judge the 
appropriateness of fit. If one wished to minimize the squared deviations 
between observed and fitted in terms of y² (or log y²) the values of 
Y² (or log Y²) would have to be weighted inversely as their estimated 
approximate variance, with a loss of intuitive appreciation of the 
quality of fit from a scatter plot and a substantial inconvenience in 
carrying out the fitting procedure. 

In Figs. 4(c) and 5(c) the ordinate is log y². This choice of 
coordinate restores the ability to discriminate in the vicinity of the cutoff at 
the cost of a large loss of sensitivity in regions where the counting rate 
is higher. 

Finally, Figs. 4(d) and 5(d) display the same data in the coordinate 
system log y², log (B/B_0). This choice of abscissa expands the high-x 
region enormously, but contracts the low-x region to the point where 
it is impossible to see the details of the particle distribution in the 
vicinity of the equator (x = 0). This contraction would be even more 
severe if the abscissa were B or B/B_0. 

In the region defined by λ < 45°, which covers the high-energy proton 
data, the coordinate x provides adequate detail (see Ref. 10 for 
further discussion). If, however, the data had extended to λ > 45°, 
another choice of magnetic coordinate would have been desirable for 
x > 0.95, because all λ > 45° are crowded into x values between 0.95 
and 1. 
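
The crowding of high latitudes into x values near 1 can be seen by tabulating x against dipole latitude. The sketch assumes that x is defined as the square root of (1 - B_0/B), which is consistent with x = 0 on the equator and with the correspondences between x and λ quoted in this paper, and it uses the standard dipole expression for B_0/B along a field line. The function name is hypothetical.

```python
import math

def x_from_latitude(lat_deg):
    """x = sqrt(1 - B0/B) on a dipole field line at magnetic latitude lat_deg."""
    c2 = math.cos(math.radians(lat_deg)) ** 2
    # Dipole field: B0/B = cos^6(lat) / sqrt(4 - 3 cos^2(lat)).
    b0_over_b = c2**3 / math.sqrt(4.0 - 3.0 * c2)
    return math.sqrt(max(0.0, 1.0 - b0_over_b))

for lat in (0, 1, 20, 40, 45, 60, 90):
    print(lat, x_from_latitude(lat))
```

Under this definition the first degree of latitude already spans x from 0 to about 0.037, while everything poleward of 45° is compressed into the last few hundredths of the x scale.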

The standard errors and correlations of the coefficients of the 
four L-slices under discussion, together with mean square error (MSE) * 
of fits, are listed in Table II. The standard error is in general a 
relatively small fraction of the estimate and the MSE is substantially 
greater at small values of L_m than at larger ones. This is further 
analyzed in Section VI. 

At L = 2.2 the satellite gets no closer to the magnetic dipole equator 
than λ = 20°. This fact, which is associated with the problem of 
correlation of coefficient estimates within L-slices, is displayed more 
emphatically by choosing x as a coordinate, as in Figs. 6(a), (b), and (c), 
than by choosing log (B/B_0) as in Fig. 6(d). In addition, in Fig. 6(d) 
the expansion of the abscissa in the region of the cutoff makes it 
difficult to judge the physical appropriateness of the value of x_c which 
results from the least squares procedure. The same difficulty is 
encountered to a lesser degree with Fig. 6(b). However, in Figs. 6(a) and 
6(c) one judges the x-intercept of the thin solid line to be too large, 
and Fig. 6(a) has the additional advantage of allowing one to make 
a better judgment of the quality of the fit at lower values of x. As might 
be surmised from the high values of the correlations for L_m = 2.2 in 
Table II, the value of x_c can be adjusted to a substantial extent 
without much change in the mean square error. These high correlations, 
which typically occur for L_m > 2, reduce confidence in the individual 
estimates of the coefficients for given L-slices. This difficulty also 
reduces the stability of the estimates of the coefficients as L_m is changed, 
and precludes basing the values of x_c(L) and S(L), for L > 2, on the 
fits to the L-slices. 

A similar difficulty may be introduced when L < 2 by sampling 
fluctuations as illustrated in Fig. 7. In this case, there is a scarcity of 
data near and beyond the cutoff, unlike the slice with L_m = 1.801 
illustrated in Fig. 5. The paucity of data near the cutoff in the L-slice 
centered on L_m = 1.79 both correlates and distorts the values of x_c 
and S. In this particular case, the width of the L-slice can be increased 
to avoid this difficulty, but, in general, increasing the width of the 
slice to include enough data may introduce a serious L-dependence 
within the slice. As a result, x_c may be determined by points near one 
extreme of L within the slice, A by points at the other extreme and S 
by some combination. This problem is especially severe below L = 1.3 
where data begin to become sparse. 


* Some statistical terms are defined in Table I. 

Fig. 6— Data from the L-slice centered at L_m = 2.202 and the results of three 
fits shown on four scales. 

Fig. 7— Data from the L-slice centered at L_m = 1.790 and the results of three 
fits shown on four scales. 

The plotted points in Figs. 8 to 10 summarize the dependencies of 
the estimates of the L-slice coefficients A, x_c, and S, respectively, on 
L_m, for all 92 slices. More than one value of the coefficients is plotted 
for some values of L_m because on occasion the width of the L-slice was 
varied without changing L_m. Although there are local fluctuations in 
the estimates that arise from the way a narrow L-slice samples the 
data, the estimates exhibit a smooth dependence on L. The fluctuations 
are particularly pronounced near L_m = 1.8 in Figs. 9 and 10, and L_m = 
1.3 in Fig. 10. 

Fig. 8— Three estimates of A as a function of L. 

Fig. 9— Three estimates of x_c as a function of L. 

The standard errors of the L-slice estimates of A are typically 1 percent 
for L < 1.95, but become as large as 6 percent where there are 
no equatorial data, as is the case for L > 1.95. For x_c estimates, the 
standard errors are typically 0.5 percent. The estimates of S have a 
standard error of about 5 percent (±0.015) near L = 1.5 and about 
15 percent (±0.05) near L = 1.2 and L = 2. The meanings of the 
curves in Figs. 8 to 10 will be discussed in the following sections. 

Fig. 10— Three estimates of S as a function of L. 


In summary, the L-slice approach enables one to infer a functional 
dependence of L-slice coefficients on L and to obtain an intuitive ap- 
preciation of the quality and nature of fit. The fitting procedure re- 
quires refinement by being carried out as a simultaneous two-dimen- 
sional process in x and L jointly. This overcomes the “grouping” 
inaccuracy in the L-slice approach and in addition makes good use of 
the data in those regions where data are scarce. The resultant function 
also provides convenient and excellent interpolation of data over the 
entire x,L region while employing a relatively small number (8, 9, or 
10) of fitted coefficients. 


VI. THE TWO-DIMENSIONAL FIT FOR THE COMPLETE BODY OF DATA 


The analysis of this section is a precursor to the more refined paral- 
lel analysis of Section VII. This preliminary analysis produces the 
following results of consequence: Model I (see Section 4.4) is shown 
to be satisfactory; instrumental effects are identified and an objective 
algorithm for partitioning the data to reduce these effects is formu- 
lated; outliers are screened; and a more adequate basis for sample 
selection is provided. Many statistical details are omitted from this 
section, and statistical matters are dealt with more fully in Sections 
VII, VIII, and IX and in Appendices B and C. 


6.1 Sample Selection and Fit 


It was necessary, for practical computing reasons, to make a selec- 
tion of approximately 1000 observations on which to carry out the 
simultaneous two-dimensional (in x and L) nonlinear (in the coefficients) 
least squares fit. In this preliminary phase, the nearly 80,000 
data points were sampled by dividing the L-range from 1.15 to 3.00 
into 925 contiguous intervals, each 0.002 wide. One data point was 
selected from each interval. As the data are approximately uniformly 
distributed in x (in the x-range covered by the satellite) in each 
L-slice (see Figs. 4 to 7), no effort was made at this point to 
influence the x distribution of the observations in this subset. The 
question of the “design” of the sample to be used as a basis for fitting the 
model is rather important, however, since the fit obtained with the 
empirical model is responsive to the distribution of data in x,L space. 
Other bases of sampling were employed later (see Section 7.1 and Ap- 
pendix B.3). 
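
The preliminary sampling scheme can be sketched as follows. The text states only that one data point was selected from each of the 925 intervals; the sketch assumes a uniformly random choice within each occupied interval, which is an assumption and may differ from the selection rule actually used. The function name, the tuple layout (L, x, Y), and the seed are hypothetical.

```python
import random

def sample_one_per_interval(points, l_min=1.15, l_max=3.00, width=0.002, seed=0):
    """Pick one observation at random from each occupied L-interval.

    points: iterable of (L, x, Y) tuples. Returns the selected tuples,
    ordered by increasing interval index.
    """
    n_bins = round((l_max - l_min) / width)  # 925 for the default arguments
    bins = {}
    for p in points:
        l = p[0]
        if not (l_min <= l < l_max):
            continue
        # Clamp to guard against rounding at the upper edge of the range.
        k = min(int((l - l_min) / width), n_bins - 1)
        bins.setdefault(k, []).append(p)
    rng = random.Random(seed)
    return [rng.choice(bins[k]) for k in sorted(bins)]

# Synthetic observations, for illustration only.
gen = random.Random(1)
points = [(gen.uniform(1.15, 3.0), gen.random(), 0.0) for _ in range(20000)]
sample = sample_one_per_interval(points)
print(len(sample))
```

Because the selection is by L-interval only, the x distribution of the sample simply follows that of the data, as noted above.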

Model I, described in Section 4.4, was fitted to the 925-point sample 
from the complete body (CB) of data. As this serves only as a pre- 
liminary fit, the values of the CB coefficients and other statistics are 
not presented here. 

The quality of this fit was examined from various viewpoints: (i) 
by its behavior along the boundaries of the belt; (ii) by comparison 
with the L-slice fits; (iii) by plotting the residuals (observed value 
minus fitted value) versus the x and L coordinates; and (iv) by 
examining the mean square residuals (MSR) in various regions of 
magnetic coordinate space. Though the coefficients of the model were 
estimated from 925 sampled data points, the evaluation of quality of fit 
was based on all the nearly 80,000 observations. 


6.2 Evaluation of Fit at Equator 


The points in Fig. 11 are the values of Y (square root of observed 
counting rate) plotted against L for all data points for which x is near 
0, specifically x < 0.037 (i.e., λ < 1°). For a given L, y'(x, L) changes 
very little between x = 0 and x = 0.037 (see Figs. 4 and 5) and the 
points in Fig. 11 may be regarded as approximate equatorial points. 
The curve in Fig. 11 gives the fitted values of A'(L) = y'(0, L) using 
the CB coefficients, and appears to represent the data very well. Note 
that A'(L) has not come from a fit to the equatorial data as such, but 
rather is the equatorial value of y' as predicted by the two-dimensional 
fit. That is, the fitted A'(L) does not minimize the sums of squares of 
deviations for just the equatorial points, but is, rather, the optimum 
fit in the least squares sense to the 925-observation sample, and these 
observations are distributed through x,L space. The excessive scatter 
in the equatorial value of Y between L = 1.85 and L = 1.55 which 
shows in Fig. 11 will be taken up in the next section. 

The values of A'(L) are also plotted for reference as the dashed 
line in Fig. 8. One sees that the L-slices give quite good estimates for 
A, although these estimates tend to be a little erratic and to favor 
the lower values rather too much in the neighborhood of L = 1.4. 

Fig. 11— All data for x < 0.037 (i.e., within 1° of the magnetic invariant 
equator) and the equatorial value estimated from the CB coefficients plotted 
against L. A' and Y are in units of (counts/sec)^(1/2). 


6.3 Evaluation of Fit at Cutoff 


The cutoff may be thought of as the position of the outer envelope of 
the nonzero counting rate, or the inner envelope of the zero counting 
rate. Thus, in practice the location of the cutoff is associated with the 
sensitivity of the detector, rather than with the absence of particles. 
For L = 2, there is a wide range of x over which there are many in- 
stances of either zero or one count occurring during the 11-second count- 
ing interval, and as a result the cutoff is not well-defined. This is 
exemplified in Fig. 6. The overlapping of the region in which no count is 
observed with that in which one count is observed shows clearly in 
Fig. 12. The locations of occurrences of zero counts are plotted in R,λ 
coordinates in Fig. 12(b) and in x,L coordinates in Fig. 12(d). Figs. 
12(a) and (c) show the locations at which one count (one, two, and three 
counts for L < 1.5) was recorded. (The density of points has been 
reduced at high L to improve the clarity of the display.) 

Because the cutoff is increasingly difficult to define from the data as 
L increases beyond ≈2, the position of the cutoff predicted by the fitted 
model is not a good boundary condition to use in judging the quality of 
the two-dimensional fit. Instead the locus of positions for which exactly 
one count per counting interval is predicted is superimposed as the solid 
lines in Figs. 12(a) and (c) upon the array of points giving the band 
of positions at which one count per counting interval was observed. The 
data are represented quite satisfactorily by the solid lines, particularly in 
the region (L ≤ 1.90) where the belt ends abruptly. The fit is least 
satisfactory near L = 2 (λ = 40°). Adding the terms r_4(L - L_0)^4 and 
r_5(L - L_0)^5 to the expansion for R_c(L) in (5) does not appreciably 
improve the fit near λ = 40°. 
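
For a single value of L, the position at which exactly one count per counting interval is predicted can be found by bisection, since the modeled y decreases steadily from the equator to the cutoff. The sketch assumes the form y = A [1 - (x/x_c)^2]^(S + 1/4) / (1 - x^2)^(1/4), an 11-second counting interval, and the coefficients listed in Table II for L_m = 1.35; the function names are hypothetical. Repeating the calculation over a range of L, with coefficients from the two-dimensional fit, would trace out a locus of the kind described above.

```python
import math

def y_model(x, a, x_c, s):
    """Assumed model for the square root of counting rate at fixed L."""
    if x >= x_c:
        return 0.0
    return a * (1.0 - (x / x_c) ** 2) ** (s + 0.25) / (1.0 - x * x) ** 0.25

def one_count_x(a, x_c, s, interval_sec=11.0, tol=1e-10):
    """x at which the predicted rate y^2 equals one count per interval."""
    target = math.sqrt(1.0 / interval_sec)  # y for one count per interval
    if a <= target:
        return None  # the rate never reaches one count per interval
    lo, hi = 0.0, x_c  # y(lo) > target and y(hi) = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if y_model(mid, a, x_c, s) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_one = one_count_x(6.757, 0.6795, 0.324)
print(x_one, y_model(x_one, 6.757, 0.6795, 0.324) ** 2 * 11.0)
```

Because the model falls steeply near the cutoff on a well-defined slice such as this one, the one-count position lies only slightly inside x_c.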

The line x_c(L), representing the cutoff itself, is plotted as the dashed 
line in Fig. 12 and is seen to be a reasonable outer envelope for the 
nonzero counts. 

The present estimate of x_c(L) is also shown as the dashed line in 
Fig. 9. Below L = 1.8, the estimates of x_c from the individual L-slices 
are in good agreement with estimates from the two-dimensional fit. 
However, above L = 1.8 the L-slices give erratic values for x_c. As 
demonstrated in Fig. 7, the L-slice estimates may be biased toward 
high values, a circumstance which makes it difficult to extract a 
satisfactory fit for x_c(L) from the estimates of x_c produced by fitting the 
L-slices. 


6.4 Behavior of S(L) 


The values of the function S(L) generated by the two-dimensional 
fit cannot be subjected to a simple boundary comparison with the data. 
The function S(L) is plotted as the dashed line in Fig. 10 along with 
the L-slice estimates. It will be seen that the L-slice estimates tend to be 
somewhat higher than the values given by S(L) in the neighborhoods 
of L = 1.3 and L = 1.9. However, if the form of S(L) is taken to 
provide a better fit to the points in Fig. 10, then the resulting 
two-dimensional fit yields a physically less satisfactory fit of the cutoff 
function x_c(L) to the boundary data without substantial improvement 
in the overall fit (see also Section 4.7). Admittedly, this judgment 
is subjective because it is made in regard to regions where the 
cutoff is poorly defined by the data because of the insufficient 
sensitivity of the detector. The high values of S near L = 1.9 appear to 
arise from the correlation problem discussed in Section V in connection 
with Fig. 6 and Table II. 

Fig. 12— All positions in R,λ space (a) and x,L space (c) at which one count 
(one, two, and three counts for L < 1.5) was observed in an 11-second counting 
interval, and all positions in R,λ (b) and x,L space (d) at which zero counts 
were observed in an 11-second counting interval. The solid lines are the loci of 
positions at which the CB coefficients estimate one count in 11 seconds. The 
dashed lines are the loci of the cutoff function x_c(L) or R_c(L) calculated from 
the CB coefficients. The trace R = 2.0 R_e, which explains the absence of data 
in the lower right-hand corner of the x,L plots, appears in part (d). The cluster 
of points near R = 1.1 and λ = 20° in part (b) of the figure is data acquired by 
the telemetry station at Woomera, Australia. It represents observations made 
near perigee when the satellite was below the bottom edge of the proton belt, 
which is high over the western Pacific Ocean. 


6.5 Behavior of the Fit on Several L Slices 


The dashed lines in Figs. 4 to 7 are the values predicted by the CB 
coefficients superimposed on the L-slice data along with the pre- 
viously derived L-slice fit. In Figs. 4 and 5, the difference between the 
thin solid and the dashed lines is insignificant, and this is generally 
the case for L < 1.95. At L_m = 1.79, the predictions from the CB 
coefficients differ importantly from the fit to the L-slice only for x 
values at which there are no data. 

For L_m = 2.2, however, the two predictions are noticeably different 
as may be seen in Fig. 6. The fit to the L-slice gives the estimate 
x_c = 0.954 (see Table II); the two-dimensional fit yields x_c = 0.928; 
and the difference exceeds two standard deviations. The question as to 
which of the two lines is a better representation of the data in this 
L-slice in the physical sense, rather than in the least squares sense 
applied to these points by themselves, is connected with criteria 
which will be discussed in the following sections. The basic fact is 
that the two-dimensional fit provides a mechanism by which the data 
on every L-slice can influence the fit on every other L-slice and 
thereby provides a fit that is more satisfactory overall than the 
collection of individual L-slice fits. 


6.6 Residuals in x,L Space 


The data were also examined for dependencies on x and L over 
and above those provided for by the fitted mathematical model. This 
is accomplished by studying the residuals, i.e., (Y - y), for all the 
nearly 80,000 observations. The residuals provide a very sensitive basis 
for judging the quality of the fit. The removal of the principal depen- 
dence on x and L by subtracting the fitted function from the observa- 
tions has the effect of allowing small systematic differences to be 
prominently displayed. 

Fig. 13. CB residuals of Y (i.e., Y − y calculated from the CB coefficients) plotted
against L. The arrows indicate ± the approximate standard deviation if Y² were
Poisson distributed. No more than one point is plotted for an L increment of 0.0006.


Fig. 13 shows a 3100-point sample of the residuals, Y − y, plotted
against L, where, to keep the density of the points reasonable, only
one point has been plotted from each of the nearly 3100 contiguous
L-intervals, of width ΔL = 0.0006, between L = 1.15 and L = 3.
Ideally, the residuals should scatter randomly about 0, without any
perceivable pattern. For L < 2.4 there is only a little indication of a
nonrandom trend. However, for L > 2.4 there is a distinct pattern.
This pattern is associated with the quantization error, which becomes
important where the number of counts per counting interval is very
small. When 0 < y < √(1 count/11 sec) and Y = 0 or √(1 count/11 sec),
the result is the tailing upward toward the residual = 0 axis that starts
at L ≈ 2.4. When y = 0 and Y = 0 or √(1 count/11 sec), one gets
the two-line pattern (0 and 0.301 = √(1/11)) seen clearly in Fig. 13
for L = 2.7. (The thickening of the zero axis indicates the presence
of data points.)
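
The quantization pattern described above can be illustrated with a short
sketch. It assumes, as the units in the text imply, that Y is the square
root of the counting rate over an 11-second interval; the function names
are hypothetical and are not part of the original analysis.

```python
import math

COUNT_INTERVAL_SEC = 11.0  # length of one counting interval, from the text


def quantized_Y(counts):
    """Square-root counting rate, Y = sqrt(counts / interval)."""
    return math.sqrt(counts / COUNT_INTERVAL_SEC)


def possible_residuals(y_fit, max_counts=1):
    """Residuals Y - y that can occur when 0..max_counts counts are recorded."""
    return [quantized_Y(k) - y_fit for k in range(max_counts + 1)]


# Fitted value exactly zero: only two residual levels are possible.
two_line_levels = possible_residuals(0.0)

# Small positive fitted value: the lower level drops below zero and the
# upper level moves toward the residual = 0 axis.
tailing_levels = possible_residuals(0.2)
```

With y = 0 the residual is either 0 or √(1/11), which gives the two lines;
with 0 < y < √(1/11) the two levels shift downward by y, which gives the
tailing toward the zero axis.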

Fig. 14 is a plot of the residuals against x for all points for which
1.4 < L < 1.6. The residuals in Fig. 14 show no structure; however,
their average value is a little less than zero. This dip is confirmed by
the points in the range 1.4 < L < 1.6 in Fig. 13, and means that the
value of y is slightly high relative to the data in this region. However,
the lack of structure in Fig. 14 indicates that the bias is independent
of x in this region.

Fig. 15, the plot of the residuals vs x for 1.85 < L < 1.90, shows
the region in which the fit is poorest. The residual points are not
symmetrically distributed about zero and the asymmetry seems to depend
on x. Notice that the value of y is slightly too large near x ≈ 0.05 and
x ≈ 0.65. The discussion of these trends is continued below, after
some further analysis has been described.


6.7 Mean Square Residuals in x,L Space 

Fig. 14. CB residuals of Y (i.e., Y − y calculated from the CB coefficients) plotted
against x for 1.40 < L < 1.60. The arrows indicate ± the approximate standard
deviation if Y² were Poisson distributed.

Another way of gauging the quality of fit is to compute the mean
square of the residuals (MSR) separately for various regions of
x,L space. Trends in these quantities may indicate regional variations
in the adequacy of fit. The data and residuals were divided into
three groups. Group I contains all the “good” data points “within”
the boundaries of the > 50 MeV proton belt. These points are defined
as those not included in Groups II and III. Group II consists of the
“good” data points “outside” the boundaries of the belt. These are
points which meet two criteria: they have values of (x, L) for which
x is greater than x_c(L) + 0.001, and they are not in Group III. Group
III comprises the outliers or “bad” data points, defined as those points
whose residuals are greater than three times the overall root mean
square residual of the points in all three groups together.* The most
probable origin of a point in Group III is a telemetry error.
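
The grouping rules can be written compactly as in the sketch below. The
function and argument names are hypothetical, and the use of the absolute
value of the residual for the Group III test is an assumption; the text
says only that the residual is "greater than three times" the overall
root mean square residual.

```python
def classify_point(x, x_cutoff, residual, rms_residual_all):
    """Return 'I', 'II' or 'III' following the grouping rules in the text.

    x                : position of the observation along the L-slice
    x_cutoff         : fitted cutoff x_c(L) at the observation's L value
    residual         : Y - y for the observation
    rms_residual_all : root mean square residual over all three groups
    """
    # Group III: outliers (assumed here to be judged on |residual|).
    if abs(residual) > 3.0 * rms_residual_all:
        return "III"
    # Group II: good points outside the belt boundary.
    if x > x_cutoff + 0.001:
        return "II"
    # Group I: everything else, i.e., good points within the belt.
    return "I"
```

Because the Group III threshold depends on the root mean square residual
of all points together, the threshold has to be computed once from the
complete set of residuals before any point is classified.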

* Note that only 0.5 percent of the data fall in Group III.

If the number of counts in a counting interval behaves like a
Poisson random variable, then the variance of Y² would be equal to
the average value of Y². As noted in Appendix B, when Y is not near
zero, the variance of Y would then approximately equal 0.023,
independent of the average value of Y. This value then might
approximately represent the average value of the mean square residual,
MSR, on the scale of Y. Thus, the number 0.023 provides a baseline
for the comparisons discussed below.
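
The baseline value can be checked numerically. The sketch below assumes
that Y = √(N/11), with N the Poisson-distributed number of counts in an
11-second interval; this reading is inferred from the units used in the
text, and the function name is hypothetical. For a Poisson variable with
a large mean the variance of √N approaches 1/4, so the variance of Y
approaches 1/(4 × 11) ≈ 0.023.

```python
import math


def var_sqrt_rate(mean_counts, interval_sec=11.0):
    """Variance of Y = sqrt(N / interval) for Poisson N with the given mean.

    E[Y^2] = mean / interval is exact; E[Y] is summed term by term over
    the Poisson probabilities, truncated far out in the upper tail.
    """
    upper = int(mean_counts + 12.0 * math.sqrt(mean_counts) + 30)
    prob = math.exp(-mean_counts)  # P(N = 0)
    expected_sqrt_n = 0.0
    for k in range(1, upper + 1):
        prob *= mean_counts / k  # P(N = k) from P(N = k - 1)
        expected_sqrt_n += prob * math.sqrt(k)
    expected_y = expected_sqrt_n / math.sqrt(interval_sec)
    expected_y2 = mean_counts / interval_sec
    return expected_y2 - expected_y ** 2


baseline = 1.0 / (4.0 * 11.0)          # limiting value, about 0.0227
var_high_rate = var_sqrt_rate(50.0)    # Y well away from zero
var_near_zero = var_sqrt_rate(0.05)    # Y almost always zero
```

The second call shows the behavior near zero counting rate, where the
variance of Y falls well below the baseline.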

Table III lists the mean square residuals (MSR) by L range and 
by Group. For Group II, Y is frequently zero and, as x > x_c implies
y = 0, one finds that the residual is zero very often. Of course, under
the Poisson assumption the variance of Y when its average value is 0 
or very close to 0 will be less than 0.023 (see Appendix B.2) and the 
appearance of MSR values smaller than 0.023 in Group II is thus not 
surprising. A similar circumstance exists in Group I for L > 2.6. 

Fig. 15. CB residuals of Y (i.e., Y − y calculated from the CB coefficients) plotted
against x for 1.85 < L < 1.90. The arrows indicate ± the approximate standard
deviation if Y² were Poisson distributed.


All data. CB coefficients

   L-Range          Group I            Group II           Group III
L_min  L_max    No. of     MSR     No. of     MSR     No. of      MSR
                points             points             points
 1.1    1.2       148     0.039       31     0.009       0       0
 1.2    1.3      1147     0.053       68     0.019       9       4.171
 1.3    1.4      1608     0.106       99     0.027      20       2.265
 1.4    1.5      1939     0.106      120     0.010      19       6.743
 1.5    1.6      2974     0.083      101     0.004      22       9.079
 1.6    1.7      3835     0.056      104     0.004      26       4.617
 1.7    1.8      5233     0.055       87     0.001      29       4.356
 1.8    1.9      8487     0.054       92     0.001      54       4.110
 1.9    2.0      8880     0.041       98     0.011      55       4.280
 2.0    2.1      6261     0.043      106     0.033      24       8.031
 2.1    2.2      5354     0.032      183     0.049      18       6.509
 2.2    2.3      4717     0.030      313     0.047      16       6.982
 2.3    2.4      4040     0.034      477     0.030      21       9.478
 2.4    2.5      3769     0.044      716     0.021      22       8.296
 2.5    2.6      2987     0.038     1000     0.014      15       6.462
 2.6    2.7      2066     0.023     1696     0.010      15      17.098
 2.7    2.8       225     0.011     3104     0.007      24      13.343
 2.8    2.9         0     0.0       2784     0.006      11      14.545
 2.9    3.0         0     0.0       2394     0.005       6      15.908
 1.1    3.0     63670     0.048    13573     0.011     406       7.011
 1.1    3.0       925     0.045    MSE of CB sample (Group I + Group II)
 1.1    2.0       975     0.065    MSR of equatorial points, λ < 1° (Group I)




HTB data. Model I coefficients (see Table IV)

   L-Range          Group I            Group II           Group III
L_min  L_max    No. of     MSR     No. of     MSR     No. of      MSR
                points             points             points
 1.1    1.2       111     0.037       28     0.010       0       0.0
 1.2    1.3       650     0.045       49     0.028       8       4.435
 1.3    1.4       633     0.059       78     0.043       6       1.253
 1.4    1.5       693     0.050       56     0.0         7       5.892
 1.5    1.6       926     0.039       43     0.019       1       0.816
 1.6    1.7      1342     0.036       38     0.002       6       5.472
 1.7    1.8      2161     0.037       39     0.005       8       5.474
 1.8    1.9      4708     0.037       30     0.003      38       4.981
 1.9    2.0      5585     0.046       28     0.013      40       4.184
 2.0    2.1      3728     0.049       38     0.021      16       9.037
 2.1    2.2      3258     0.033       38     0.036      10       9.716
 2.2    2.3      2857     0.030       80     0.034      10       6.581
 2.3    2.4      2335     0.032      135     0.027      14       9.094
 2.4    2.5      2193     0.043      212     0.020      11       8.922
 2.5    2.6      1831     0.041      278     0.011      11       6.638
 2.6    2.7      1520     0.027      464     0.007       9      17.120
 2.7    2.8      1083     0.014      765     0.007      18      15.845
 2.8    2.9       146     0.009     1433     0.007       9      16.145
 2.9    3.0         0     0.0       1317     0.005       4      20.590
 1.1    3.0     35760     0.038     5149     0.009     226       7.926
 1.1    3.0       960     0.036    MSE of 960-point HTB sample (Group I + Group II)
 1.1    2.0       429     0.035    MSR of equatorial points, λ < 1° (Group I)


Poisson approximation: variance ≈ 0.023. Note conditions in text and Appendix B.2.

For the overall fit, the MSR of Group I (L range from 1.1 to 3.0) is
only twice 0.023. However, for 1.3 < L < 1.6 the Group-I MSR is four
times 0.023. This L range is associated with the large scatter in the
equatorial data plotted in Fig. 11, and Fig. 14 shows that this scatter is
independent of x, rather than just an equatorial phenomenon. This
issue is pursued further below.
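
The two ratios quoted above can be reproduced from the Group I entries of
Table III (all data, CB coefficients) by pooling the per-range MSR values
with the numbers of points as weights. The following is a small
arithmetic sketch; the variable and function names are hypothetical.

```python
# (number of points, MSR) for Group I, CB coefficients, in successive
# L ranges of width 0.1 from L = 1.1 to L = 2.8 (Table III).
group1_cb = [
    (148, 0.039), (1147, 0.053), (1608, 0.106), (1939, 0.106),
    (2974, 0.083), (3835, 0.056), (5233, 0.055), (8487, 0.054),
    (8880, 0.041), (6261, 0.043), (5354, 0.032), (4717, 0.030),
    (4040, 0.034), (3769, 0.044), (2987, 0.038), (2066, 0.023),
    (225, 0.011),
]

POISSON_BASELINE = 0.023


def pooled_msr(rows):
    """Point-weighted mean of the per-range mean square residuals."""
    total_points = sum(n for n, _ in rows)
    return sum(n * msr for n, msr in rows) / total_points


overall = pooled_msr(group1_cb)        # whole range, 1.1 < L < 2.8
band = pooled_msr(group1_cb[2:5])      # 1.3 < L < 1.6 only

overall_ratio = overall / POISSON_BASELINE
band_ratio = band / POISSON_BASELINE
```

The pooled value for the whole range agrees with the tabulated 0.048 to
the precision of the table, and the band 1.3 < L < 1.6 comes out at
roughly four times the baseline.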


6.8 Dependence of Residuals on Other Variables 


Studies were made of the possible dependence of the residuals on 
observed variables other than x and L. Indeed, it will appear that 
some of the excess scatter exhibited in Table III and in Figs. 11 and 
14 is associated with instrumental effects. 

The regularities inherent in the orbit and orientation of a satel- 
lite, the motion of the earth, and the location and operation of the 
telemetry receiving stations lead to systematic interrelations among 
the various coordinates listed in Table I. A simple example concerns 
temperature. The satellite cools when it enters the earth’s shadow.
This eclipse occurs only on the night side of the earth. Thus, if the 
detector is temperature sensitive, one would see a false day-night 
effect in the counting rate. If, because of additional dependencies, 
data are available during eclipse for only a limited span of days, a 
false secular effect might also be observed. Because of the implications 
of the preceding discussion, a careful study was made of the behavior 
of the residuals with respect to a large number of coordinates, and 
attention was given to the details of the relationships among the 
coordinates during the search for contributors to the inflation of the 
MSR. 

We present below the evidence that has led us to the conclusion that 
two instrumental effects, variations in bias voltage and changes in 
temperature of the detector, are principal causes of inflation of the 
MSR. 

There was no temperature sensor on the particle detector. The 
instrument is not exposed to sunlight and is relatively well-insulated 
thermally from the skin and frame of the satellite. Consequently, 
temperature measurements of the skin are not closely related to the 
temperature of the detector. However, a good indicator of detector 
temperature is elapsed time since entering or since leaving eclipse. 
Fig. 16 gives plots of the residuals, Y — y, against time in minutes 
measured from the more recent of the two events, entered shadow or 
entered sunlight. Residuals associated with periods during which the 
satellite did not enter eclipse once per orbit are segregated at the far
right-hand side of the plots, labeled A on the abscissa.

Figs. 16(a) and (b) are for 1.4 < L < 1.6. The points in Fig. 16(a) 
are those for which the bias voltage was between 95.3 and 97.5 volts, 
while Fig. 16(b) contains those for bias voltages between 92.0 and 
95.3 volts. The decrease in the residuals (and also in the observed count- 
ing rate) after the satellite enters eclipse (and the temperature falls) 
and the increase after the satellite leaves eclipse (and the temperature 
rises) may be seen distinctly in both figures. In addition the residuals are 
noticeably more negative for the low (92.0 to 95.3 V) bias range. Both 
low bias voltage and low temperature are known to decrease the ef- 
ficiency of the detector and one expects an appreciable effect to be intro- 
duced into the counting-rate data. In the present case the scatter is 
about +15 percent of the counting rate. A consequence of this is the 
excess scatter that has been noted particularly with reference to Fig. 11 
and Table III.

Figs. 16(c) and (d) are analogous to Figs. 16(a) and (b), but the 
residuals are for the L range 1.85 to 1.90. Again, the systematic
influence of low temperatures and low bias voltages is unmistakable. 


6.9 Partitioning the Data 


Two ways of responding to these instrumental effects might be:
(i) to try to correct the data, or (ii) to disregard the affected data.
It is not possible to make a correction to the counting rate that is
properly independent of the experimental results because: (i) the bias
voltage was measured in steps of 1.11 V, which is not sufficiently
fine-grained; (ii) it would be necessary to estimate the temperature of
the instrument using a complicated hypothetical relationship between
the instrumental temperature, skin temperature, and time after entering
eclipse (or sunlight); and (iii) we have an insufficient knowledge
of the temperature and bias-voltage sensitivity of the detector.

Though an ad hoc correction based on the observed counting rates
could have been attempted, it was decided for practical reasons to
eliminate both the low-temperature and low-bias points and use only
those data that were gathered under the following conditions:


(i) The satellite had been in sunlight for the previous 50 minutes,
and thus had attained temperature equilibrium reasonably well
(see Fig. 16).

(ii) The bias voltage was between 95.3 and 97.5 volts.

Fig. 16. CB residuals of Y (i.e., Y − y calculated from the CB coefficients) plotted
against time in minutes from the most recent of the two events, entered eclipse
or entered sunlight. Data taken on days during which no eclipse occurred are plotted
within the region marked “A” at arbitrary values of the abscissa. The arrows
indicate ± the approximate standard deviation if Y² were Poisson distributed.


This selection yields a homogeneous body of 41,135 points, henceforth
referred to as high temperature-high bias (HTB) observations.
The remaining 36,500 points, which represent a mixture of temperature
and bias conditions, were used only occasionally in further
analyses. This selection process coincidentally produces one unfortunate
associated circumstance, namely, the exclusion, as low-bias
data, of all measurements made between days 325 and 373.
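
The two selection conditions amount to a simple filter on each
observation, sketched below. The record layout and names are hypothetical.
Two details are assumptions rather than statements from the text: the
bias limits are treated as inclusive, and observations from periods with
no eclipse are treated as satisfying the sunlight condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    # Minutes since the satellite last entered sunlight; None is used
    # here (hypothetically) for periods during which no eclipse occurred.
    minutes_in_sunlight: Optional[float]
    bias_volts: float


def is_htb(obs, min_sunlight_min=50.0, bias_lo=95.3, bias_hi=97.5):
    """True if the observation meets both HTB conditions."""
    warm = (obs.minutes_in_sunlight is None
            or obs.minutes_in_sunlight >= min_sunlight_min)
    high_bias = bias_lo <= obs.bias_volts <= bias_hi
    return warm and high_bias


observations = [
    Observation(60.0, 96.4),   # warm, high bias
    Observation(20.0, 96.4),   # recently left eclipse
    Observation(60.0, 94.2),   # low bias
    Observation(None, 96.4),   # no eclipse on this orbit
]
htb_flags = [is_htb(obs) for obs in observations]
```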

Further analysis and model fitting and development based on, and
directed towards, these HTB data are detailed in the following sections
and Appendix C.


VII. THE TWO-DIMENSIONAL FIT FOR THE SELECTED (HTB) DATA 


7.1 Sample Selection 


The distribution of the HTB data in magnetic space is indicated
in Fig. 17, which gives the R,λ coordinates of every tenth point from
the 41,135 L-ordered HTB observations. The data provide reasonably
adequate, though uneven, coverage. As a practical requirement for the
fitting procedure, a “representative” sample of about 1000 observations
must be selected.





Fig. 17. The spatial distribution of the HTB data for L < 3 in R,λ coordinates.
Every tenth point from the L-ordered data is plotted.


It is intuitively clear from preliminary knowledge of the radiation
distribution that some sample configurations will be far more effective
than others in defining the functional form of the proton flux.

The sample selection is important because: (i) nothing more than
a sophisticated smoothing function is being fitted and we want this
function to be broadly applicable over the entire space; (ii) an
optimum fit in one region of space does not necessarily imply a good
fit elsewhere; (iii) the spatial distribution of data points depends on
the satellite orbit and the position of the telemetry stations; (iv) even
with the square root transformation, there remains some differential
variance among the data.

These considerations argue against using a simple random sample 
or even a random sample in x with a systematic sample in L such 
as in the CB fit. Indeed, they also argue against fitting all (un- 
weighted) HTB data, even if this were practical. Alternatively, points 
might be chosen on the basis of a simple geometric grid in magnetic 
space. Such a procedure would be easy to use, but it is arbitrary with 
respect to the radiation belts. 

Sampling procedures might be based on particular physical features 
of the radiation belts to emphasize the goodness of fit, for example, 
where the flux is high or where diffusion across L lines might be 
important. However, such fits would be too biased for our present 
general objective. 

One is thus led to a sampling process based on properties of the 
radiation belt itself, as described for example by the preliminary CB 
fit. In particular, a high density of data points is desirable in regions 
where the value of y is changing rapidly, while a low density will 
suffice where the function is changing slowly. A realization of this 
criterion would be to define about 1000 x,L cells, within each of
which the range of y from the preliminary fit would be the same. 
However, there are appreciable practical difficulties in defining the 
boundaries of such cells. 

Thus, the following hybrid procedure was used to define the 960-point
HTB sample on which the subsequent fitting was done: The
L-range from 1 to 3 was divided into about 120 L-slices of equal
(≈ 0.017) width in L. Each L-slice was then divided into eleven
x,L cells using a scheme that depends on the preliminary fit. The
first ten cells were chosen so that within each cell the range of y
predicted by the CB model is closely 1/10 of the equatorial value of
y at the center of the L-slice. The eleventh cell lies beyond x_c. The
method of partitioning in the x direction is illustrated by the partition
of the L-slice in Fig. 5(a) into five x-regions by the horizontal
lines. (The distance d is added to x_c to define the lower-x boundary of
the last cell.)
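
The partition of one L-slice into cells of equal range in y can be
sketched as follows. The profile used in the example is a hypothetical
stand-in, chosen only because it decreases monotonically from its
equatorial value at x = 0 to zero at the cutoff; it is not the fitted
function of the model. The function name and the bisection search are
likewise illustrative choices.

```python
import math


def equal_range_boundaries(profile, x_cutoff, n_cells=10, tol=1e-9):
    """Boundaries in x such that y drops by 1/n_cells of y(0) in each cell.

    profile must decrease monotonically from profile(0) to 0 at x_cutoff.
    Returns n_cells + 1 boundaries running from 0 to x_cutoff.
    """
    y_equator = profile(0.0)
    bounds = [0.0]
    for k in range(1, n_cells):
        target = y_equator * (1.0 - k / n_cells)
        lo, hi = bounds[-1], x_cutoff
        while hi - lo > tol:          # bisection on a decreasing function
            mid = 0.5 * (lo + hi)
            if profile(mid) > target:
                lo = mid
            else:
                hi = mid
        bounds.append(0.5 * (lo + hi))
    bounds.append(x_cutoff)
    return bounds


X_CUTOFF = 0.9   # illustrative cutoff for one L-slice


def example_profile(x):
    """Hypothetical decreasing profile, equal to 1 at x = 0 and 0 at cutoff."""
    return 1.0 - (x / X_CUTOFF) ** 2


boundaries = equal_range_boundaries(example_profile, X_CUTOFF)
expected_mid = X_CUTOFF * math.sqrt(0.5)   # where the profile falls to 1/2
```

The eleventh cell of the procedure, which lies beyond the cutoff, is not
produced by this function; it would start at the cutoff plus the distance
d mentioned in the text, whose value is not given in this section.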

To take some account of differential variances remaining after
the square root transformation, the following procedure was
employed: The mean square deviation from the mean (MSD*) was
calculated for all the HTB data in each x,L cell defined above;
thence, after visual inspection of the results (see Appendix C), three
groupings of contiguous x,L cells were made according to whether the
MSD’s were generally below 0.013, between 0.013 and 0.020, or above
0.020; the corresponding regions were then given relative weights of
2, 1½, and 1, respectively. The weight 1 implies that one point was
sampled from the cell.

These weights were assigned on the basis of a judgment which
considered: (i) the desire to increase the weight of low variance (i.e.,
near-zero counting rate) observations and thus to aid the definition of
the cutoff; and (ii) the desire to keep from “wasting” sample points
in the region x > x_c, since such data will add little to the specification
of x_c(L) and virtually nothing to the estimation of A(L) and S.

Fig. 18 shows the distribution in x,L space of the 960-point sample
which was used. The number 960 came about because a number of
the defined cells had no data in them. Our experience with several
other samples of the HTB data gives us confidence in both the rationale
behind, and the results obtained with, this 960-point set, henceforth
referred to as the HTB sample. However, sampling procedures tailored
to the requirements of special purpose fits will give better results in
some regions of x,L space.

Some additional discussions relevant to sample selection and data
usage are given in Section 13.3 and Appendices B.3 and C.2. 


7.2 The HTB Fit 


A slightly constrained version of Model I of Section 4.4 was fitted
to the 960-point HTB sample. The results are referred to as the HTB
fit. The constraint is s_1 = 0 in (3). Most of the values of s_1 obtained
in preliminary fits to various samples of the HTB data differed from
zero by less than two standard deviations. Also, the points in Fig. 10
do not suggest a linear dependence of S on L.* The effect of this
constraint on the value of the fitted cutoff function was examined and
found to be unimportant.


* See Table I for definition of MSD, MSR and MSE.

Fig. 18. The distribution of the 960-point HTB sample in x,L space. The
trace R = 2.0 R_e explains the absence of data in the lower right-hand corner of
the figure.

The estimated HTB coefficients (obtained by fitting the constrained
model to the HTB sample) appear in Table IV. The physical
interpretation of L_0 as the lowest L on which > 50 MeV protons were
measurable was noted in Section 4.3. The standard error of 0.001 (6
km in altitude) is no larger than the uncertainties inherent in the
calculation of L itself.

The interpretation of S as a shape factor (see Section 4.2) is
straightforward in the present case, i.e., where s_1 = 0. The standard
error of 0.005 is much smaller than the standard errors of the
estimates of S generated from the fits to L-slices (Table II) and is
also small compared to the scatter in Fig. 10. This implies that a
substantial fraction of the scatter may be associated with the high
correlation between S and x_c on the L-slice fits. Further consideration
of standard errors and correlations of the fitted coefficients and
detailed statistical evaluation of the fit is deferred to Section VIII.


* Some higher-order models for S(L) were tried but proved unsatisfactory (see
also Sections 6.4 and 9.2).

Fig. 19. Graphical summary of the HTB fit: (a) curves of y’ vs L for constant
x, (b) curves of y’ vs x for constant L, (c) contours of constant y’ in x,L space.

Fig. 19 presents a graphical summary of the function y’(x, L).
Part (a) of the figure shows y’ vs L for (several) constant x.
Physically, these curves correspond to values of the intensity of radiation
vs L for constant magnetic dipole latitude, because x = constant
implies λ = constant. The nesting of the curves in Fig. 19(a) is a
consequence of the fact that G’(x; x_c, S) decreases monotonically
with x [see (2) and Fig. 19(b)]. The shape of the curves changes
smoothly with L, and the position of the maximum shifts smoothly
toward higher L as the value of x (and therefore λ) increases.

The nesting property does not hold for plots of y’ vs x at constant
L. This general consequence of the existence of a maximum in A’(L)
is displayed in Fig. 19(b). All the curves in Fig. 19(b) have similar
dependences on x.

Fig. 20. The value of y’ computed from the HTB coefficients of Model I vs
the observed value, Y, for the 960-point HTB sample.


Fig. 19(c) contains contours of constant y’ plotted in x,L space
and completes the graphical summary. The contours surround the
point x = 0, L = 1.46 at which the peak intensity occurs.


7.3 Evaluation of Fit to the HTB Sample 


A summary indication of the quality of the fit of the 9-coefficient
Model I to the HTB sample is given in Fig. 20, in which the fitted
(computed) value, y’, is plotted against the corresponding observed
value, Y. The solid straight line would represent the case of a perfect
fit. This is impossible on the basis of a model using only x,L
coordinates since different Y values were observed for the same x,L
pairs. It is seen, however, that the scatter of the plotted points about
the line of perfect fit is reasonably uniform and that the horizontal
width of the “scatterband” is roughly constant over the entire range of y’.

In the following subsections, the quality of fit to the entire body of
HTB data is scrutinized, using many of the procedures used in the
previous section to evaluate the CB fit.


7.4 Evaluation of Fit on Equator 


The HTB fit along the equatorial boundary is displayed in Fig. 21.
The points are the values of observed Y plotted against L for all HTB
data for which 0 ≤ x < 0.037 (i.e., λ < 1°), and the plotted curve is
A’(L), defined in (6), using the HTB coefficients of Table IV. Comparing
Fig. 21 with Fig. 11, it is seen that most of the excess scatter has been
eliminated. The curve in Fig. 21 does not deviate noticeably from the
center line of the points (except for 1.5 < L < 1.6, where the curve is a
trifle high and for L ≈ 1.95, where the curve is a trifle low).


Fig. 21. All the HTB data for x < 0.037 (i.e., within 1° of the magnetic
invariant equator), shown as points, and the equatorial value A’ estimated
from the HTB coefficients, shown as a line, plotted against L. A’ and Y are
in units of (counts/sec)^(1/2).

Due to             d.f.*   Sum of squares   Mean square
Total               960      5374.7320
Mean                  1      2121.1760
Corrected total     959      3253.5560
Model                 9      5340.0645       593.3405
Error               951        34.66751        0.03645

Coefficient estimates

                   a_1      L_0       a_2      a_3       n       T_1       T_2      T_3       S
Estimate        12.0702   1.13800   0.3006   0.7131   5.6190   0.2600   −0.4937   0.3536   0.3221
Standard error   5.1178   0.0010    0.1205   0.0765   0.3798   0.0090    0.0245   0.0190   0.0048

ρ values

         a_1       L_0       a_2       a_3        n        T_1       T_2       T_3        S
a_1               0.138     0.97     −0.98      0.90      0.11      0.09     −0.08     −0.00
L_0     0.4969              0.12     −0.13      0.11     −0.47      0.27     −0.15     −0.00
a_2     0.9995    0.4783             −0.96      0.92     −0.10      0.09     −0.08     −0.00
a_3    −0.9998   −0.4998   −0.9990              −0.90      0.11     −0.09      0.08      0.00
n       0.9948    0.4705    0.9966   −0.9941              −0.10      0.10     −0.09     −0.00
T_1     0.4624   −0.8541   −0.4505    0.4617   −0.4490              −0.64      0.43      0.00
T_2     0.4261    0.6819    0.4199    0.4243    0.4268   −0.9378              −0.72      0.00
T_3    −0.3951   −0.53855  −0.3926    0.3929   −0.4066    0.8219   −0.9589              −0.00
S      −0.0571   −0.0270   −0.0616    0.0657   −0.0680   −0.1166    0.0653   −0.0587

* degrees of freedom
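
The entries in the analysis-of-variance part of the table are related by
simple arithmetic, which the following sketch checks using the tabulated
sums of squares. The variable names are illustrative.

```python
n_points = 960        # size of the HTB sample
n_coeffs = 9          # fitted coefficients in the constrained Model I

ss_total = 5374.7320  # uncorrected total sum of squares
ss_mean = 2121.1760   # sum of squares due to the mean
ss_model = 5340.0645  # sum of squares due to the model

# Corrected total: total minus the contribution of the mean.
ss_corrected_total = ss_total - ss_mean

# Error sum of squares and its degrees of freedom.
ss_error = ss_total - ss_model
df_error = n_points - n_coeffs

# Mean squares: sum of squares divided by degrees of freedom.
ms_model = ss_model / n_coeffs
ms_error = ss_error / df_error   # the MSE quoted for the HTB sample
```

The error mean square obtained this way is the value of about 0.036 that
appears as the MSE of the 960-point HTB sample in Table III.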

In Fig. 8 the solid curve, which is A’(L) calculated from the HTB
coefficients, may be compared with the dashed curve, which is A’(L)
calculated from the CB coefficients. The HTB fit gives higher
equatorial values for y’ when L is less than 1.9, as might be expected
from the fact, displayed in Figs. 16(a) and (b) and discussed
in Section 6.8, that the HTB data select the higher values of Y for
1.4 < L < 1.6. For L greater than ≈1.9, the equatorial values of the
HTB fit are somewhat lower than those of the CB fit; however, there
are no equatorial data for L > 1.95, and the comparison of the fits is
not meaningful in this region. The points in Fig. 8 are estimates based
on CB, not HTB, data and are not immediately pertinent to the solid
curve.

An estimate of the standard error of the fitted equatorial function
A’(L), based on the HTB sample, is plotted as a function of L in
Fig. 22(a) (see Section VIII for details). The standard error of
A’(L) is typically less than one percent in the range of L
(1.15 < L < 1.95) over which equatorial data are available. Error bars of
this size would hardly be visible in Fig. 21. For the same values of L,
the standard errors of A’(L) derived from the HTB fit are substantially
smaller than those from the L-slice fits listed in Table II.
As might be anticipated, the percent standard error of A’(L) increases
as the minimum x value of the available data increases with
increasing L beyond L = 2. This increase to a value of 10 percent at
L = 3 reflects increasing uncertainty in the extrapolation of the fit.
Note that the curves in Fig. 8, which represent the equatorial values
of CB and HTB fits, differ, in general, by substantially more than two
standard errors and the difference is certainly “statistically
significant.”


7.5 Evaluation of Fit at Cutoff 


Figs. 23(b) and (d) show the positions, in x,L and R,λ coordinates,
at which zero counts were observed during an 11-second counting
interval. Figs. 23(a) and (c) are corresponding plots for one count
(one, two, or three counts for L < 1.5) per counting interval. Only
HTB data are plotted, and the density of points at high L has been
reduced to improve the clarity of the display.

Judgments regarding the quality of the fit are made, once again,
with reference to the well-defined band of one count, rather than in
terms of the more nebulous cutoff. The solid lines in Figs. 23(a) and (c)
are the loci of y’(x, L) = √(1 count/11 sec), using the HTB coefficients
in the model. These lines represent the data well. Although the fit
appears uniformly good in the x,L representation, a slight weakness
near the “corner” at λ ≈ 40° is displayed sensitively in the R,λ plot
(see also Fig. 12).

Fig. 22. The standard deviation of A, σ_A, and the standard deviation of x_c,
σ_xc, as functions of L. Units of σ_A and σ_xc are the same as the units of A
and x_c, respectively. (a) Model I. (b) Model II.

The dashed lines in Figs. 23(a) and (c) show the locus of the fitted
cutoff function, x_c(L), calculated from the HTB coefficients. Error
bars indicating excursions of one standard error in x_c(L) are shown at
two places on Figs. 23(a) and (c). The standard deviation of x_c(L)
as a function of L has been estimated (see Section VIII), and is plotted
in Fig. 22(a). This standard error is smaller than those produced by
the L-slice fits at corresponding values of L (see Table II).

The values of x_c(L) for the HTB and CB coefficients are plotted in
Fig. 9. Although there is no discernible difference between the two
curves in the figure for L < 2, the difference between the tabulated
values exceeds twice the standard error (which is very small) over
much of the range of L. The two sets of coefficients thus lead to results
which differ in a “statistically significant” manner. For L less than
≈2, the significance of the standard error is more readily understood
when it is interpreted in terms of the altitude of the cutoff. This is
done in Section XI.

Beyond L = 2, the values of x_c for the CB and HTB coefficients
diverge noticeably; compare Figs. 12(a) and (c) with Figs. 23(a) and
(c), respectively. The magnitude of this divergence is quite sensitive
to the method used in selecting the samples to be fitted. As has been
discussed, the concept of a cutoff is not well defined in the context of
these measurements for L > 2. The uncertainty is reflected in the
rapid rise in the value of the standard error of x_c(L) [see Fig. 22(a)]
as L approaches 3. The significance of this rise may be more readily
appreciated by referring once more to the error bars associated with
x_c(L) in Figs. 23(a) and (c).

The partitioning of the data on the basis of electrical bias and
temperature, and the procedure chosen for selecting the sample to be
fitted, introduce statistically significant differences between the values
of x_c(L) obtained from the HTB and CB fits, as well as the more
readily anticipated significant differences in the values of A’(L).


7.6 Standard Error of Fitted Value 


Fig. 23. All positions for the HTB data in R,λ space (a) and x,L space (c)
at which one count (one, two, and three counts for L < 1.5) was observed in an
11-second counting interval, and all positions in R,λ space (b) and x,L space
(d) at which zero counts were observed in an 11-second counting interval. The
solid lines are the loci of positions at which the HTB coefficients estimate one
count in 11 seconds. The trace R = 2.0 R_e, which explains the absence of data
in the lower right-hand corner of the x,L plots, appears in part (d). The dashed
lines are the loci of the cutoff function x_c(L) or R_c(L) calculated from the
HTB coefficients. The cluster of points near R = 1.1 R_e and λ = 20° in part
(b) of the figure is data acquired by the telemetry station at Woomera,
Australia. They represent observations made near perigee when the satellite was
below the bottom edge of the proton belt, which is high over the western Pacific
Ocean.

The standard error for y’(x, L) is relatively constant, ranging
between 0.01 and 0.04, except close to x_c(L). It should be understood
that this standard error is based on the fit to the HTB sample, and
thus applies to the estimate of the average value of y and does not give
the standard deviation of a single predicted observation. The latter
would be in the neighborhood of √0.04 = 0.2 (where 0.04 is
approximately the MSE, see Table IV).

Contours of constant percent standard error in the counting rate, y²,
are shown by the curves in Fig. 24(a). For L < 2 the standard error is
less than 2 percent except close to the cutoff, where the value of y²
is falling fast. (Near the cutoff, the standard error in x_c is more
informative.) In the absence of a fitted function, it would be
necessary to average between about 30 and 300 observations to achieve a
standard deviation as small as 2 percent. As discussed in Appendix
B.4, the estimates of the standard deviation based on the HTB sample
are conservative and (if there were no biases in the model) the values
that apply to the 40,000 HTB points might be smaller than those in
Fig. 24(a) by a factor as large as 6.

Fig. 24. Contours of constant percent standard deviation in the
counting rate, y², calculated from the fits to the HTB sample and
plotted in x,L space. (a) Model I. (b) Model II.

Fig. 25. HTB residuals of Y (i.e., Y − y calculated from the HTB
coefficients) plotted against L. The arrows indicate ± the approximate
standard deviation if Y² were Poisson distributed.

The values in Fig. 24(a) are for relative counting rates (or fluxes) 
and do not include the uncertainty in the absolute calibration of the 
instrument noted at the end of Appendix A. Other discussion is given 
in Sections 9.4 and 12.2 and Appendix B.4. 


7.7 Behavior of the Fit on Several L-Slices 


Using the HTB coefficients, values of y_L(x) were calculated for
L_m = 1.35, 1.805, 2.0215, and 1.79. The results are plotted as the
heavy solid lines in Figs. 4 to 7. Recall that the points in these
figures are not all HTB points. In general, the HTB points are those
with the higher values of Y, although this may not be the case at
L = 2.2 because of the temporal effects discussed in Section X. The
four figures also allow further appreciation of the difference in
results between the CB fit and the HTB fit produced by the partitioning
of the data and the refinement of the procedure by which the sample was
selected.


7.8 Residuals in x,L Space 


The residuals, Y − y, were computed for all the HTB data using the HTB
coefficients. Fig. 25 is a plot of residuals against L, and Figs. 26
and 27 are plots of residuals against x in the indicated L-ranges.
These plots are analogous to Figs. 13 to 15, and as they display
properties similar to the earlier figures, the discussion of Section
6.6 applies. In particular, there is little indication of a dependence
of the residuals on the magnetic coordinates. Moreover, the residuals
in Figs. 25 to 27 are more closely clustered about zero than those in
Figs. 13 to 15, confirming the fact that there is less scatter in the
HTB data. This reduction in the scatter is especially marked in the
neighborhood of the peak of the radiation belt (near x = 0 between
L = 1.4 and L = 1.6, Fig. 26).

Fig. 26. HTB residuals of Y (i.e., Y − y calculated from the HTB
coefficients) plotted against x for 1.40 < L < 1.60. The arrows
indicate ± the approximate standard deviation if Y² were Poisson
distributed.


7.9 Mean Square Residuals in x,L Space 


A breakdown of the mean square residuals (MSR) by L-ranges 
for the fit to the HTB data is given in Table III. This analysis is 
analogous to that presented in Section 6.7 for the CB fit. For the 
Group I data the MSR for the overall fit (1.1 < L < 3.0) is about 
(1.5) (0.023) = 0.036 and the largest entry under HTB Group I is
0.059. The anomalous trend of the MSR near L = 1.4 evidenced in
the fit to the unrestricted data (see Section 6.7) has been largely 
eliminated. The overall MSR for the Group I data has been reduced 
by 15 percent. 

The breakdown of the MSR by L-ranges is not a particularly refined
test of the quality of the fit. This index is based on essentially all
the HTB data and, because the averaging procedure is blind to the
distribution of data within L-ranges, favors results that fit best
where the density of data is high. As the HTB sample was selected
using criteria dependent on the preliminary fit to the data and does
not necessarily favor x,L regions in which large quantities of data
were acquired, fitting this sample does not produce the lowest
obtainable value of MSR for all of the HTB data. Examination of the
MSR in x,L cells shows the effect of the sample selection procedure on
the MSR in L-ranges. Appendix C contains further information and
analysis of MSR in x,L cells.

Fig. 27. HTB residuals of Y (i.e., Y − y calculated from the HTB
coefficients) plotted against x for 1.85 < L < 1.90. The arrows
indicate ± the approximate standard deviation if Y² were Poisson
distributed.

Model I, with the HTB coefficients, provides a summary of the HTB data
that, in the light of the many sources of variability and measurement
errors, reasonably approaches the limit set by expected statistical
fluctuations.


7.10 Sources of Variability in the Data 


The residuals for the HTB data are now examined to see whether further
identifiable sources of variability may be associated with them.
Possible sources are: instrumental effects, errors in the ephemeris of
the satellite, errors in the description of the magnetic field,
telemetry errors, fluctuations in the length of the counting interval,
deficiencies in the model, and temporal variations. While all these
must make some contribution to the MSR, the interrelationships among
the coordinates discussed in Section 6.8, and the small size of the
individual contributions, make positive identifications very difficult.
We have not attempted to examine in detail the large number of 
small, apparently systematic, deviations discernible on the residual 
plots, although some of these may be “statistically significant.” In- 
stead we have restricted our study to effects which are readily ap- 
parent on the residual plots. Where the observations are dense, an 
effect would be glaringly apparent if it introduced a shift of ~ 0.05 
in the local mean of the residuals. (This corresponds to a change of 
about 1.2 percent in flux at the peak of the proton intensity, and 
about 12 percent when the flux is a hundredth of its peak value.) 
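
The two percentages quoted above follow from working on the square-root scale: a small shift d in y changes the flux y² by roughly 2d/y. The sketch below is only an illustration of that arithmetic. It assumes a peak value of y of about 8.08 (the A_p estimate listed in Table V, borrowed here as a stand-in for the peak), and the function name is hypothetical.

```python
def percent_flux_change(y, shift=0.05):
    """First-order percent change in flux y**2 caused by a small shift in y."""
    return 100.0 * 2.0 * shift / y


# Assumed peak value of y (square root of counting rate); taken from the
# A_p estimate in Table V purely for illustration.
Y_PEAK = 8.0762

at_peak = percent_flux_change(Y_PEAK)
# A flux of 1/100 of the peak corresponds to y at 1/10 of its peak value.
at_hundredth = percent_flux_change(Y_PEAK / 10.0)

print(f"{at_peak:.1f} percent at the peak")
print(f"{at_hundredth:.1f} percent at 1/100 of the peak flux")
```

The same shift in y therefore matters ten times more, in relative terms, where the flux is a hundredth of its peak value.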

Instrumental effects are associated with temperature, bias voltage, 
radiation damage, and imperfections in the omnidirectional char- 
acteristics of the detector. Restricting the range of temperature and 
bias voltage removed the major fraction of the instrumental effects 
associated with these variables. Directional effects in the detector 
might show up when the residuals are plotted against y, the angle 
between the spin axis and the local magnetic field vector. However, 
no dependence was observed, indicating that the detector is effectively 
omnidirectional. Radiation damage, though technically an instru- 
mental effect, is more logically treated with temporal variations. 

Examination of plots of residuals versus various geographic co- 
ordinates did not reveal any systematic dependencies. In view of the 
small excess of the MSR over expectation for a random Poisson
process, and the existence of other sources of error, it seems
reasonable to conclude that the ephemerides were computed with sufficient
accuracy for this analysis. 

The plots of residuals against the geographic coordinates as well
as against x and L values were used to judge the quality of the
coefficients used to calculate the magnetic coordinates L and x. No
systematic effects that can be attributed to flaws in the coefficients of 
the magnetic field were discerned. Nor is there any indication, in the 
form of excessive scatter of the residuals, that L is an imperfect 
coordinate in any part of the region of space covered by these data. 

Gross telemetry errors and those that occur in conjunction with noise
bursts are easily identified and have been discarded. There remain
telemetry errors that are indistinguishable from good data on a
point-by-point basis, and these erroneous data must make some
contribution to the scatter. As noted in Section 8.1, the distribution
of the residuals has been examined and found to be very well-behaved.
However, it is not possible to make any quantitative estimate of the
contribution of the remaining telemetry errors to the MSR.

Temporal variations are an important source of variability, and
Section X is devoted to their analysis.


VIII. STATISTICAL CRITIQUE OF MODEL I. 


This section presents further information on statistical evaluation 
of the Model I fit. (Some background concerning relevant statistical 
techniques is given in Appendix B.) While confirming the very satis- 
factory performance of Model I in fitting the data, as presented in 
Section VII, some unsatisfactory aspects are uncovered and several 
defects of the model are pinpointed. The rectification of these defects
is effected by use of Model II, discussed in Section IX.


8.1 Fit of Model I to the 960-point HTB Sample 

The analysis of variance for the fit of Model I to the 960-point HTB
sample is shown in Table IV. This gives various partitionings of the
total sum of squares (about 0) of the 960 observations (on the square
root of counting rate scale). Table IV indicates the relevance of the
model to the data in terms of its statistical effectiveness. Fitting
the nine coefficients of the model accounts for more than 99.3 percent
of the total sum of squares of the observations, leaving less than 0.7
percent associated with “error” or lack of fit. On a
per-degree-of-freedom basis, the ratio of mean square for “fitted
model” with 9 degrees of freedom to mean square for “error” is over
16,000.

Of course, simply fitting the mean of all the data accounts for a sum
of squares of 2121.2 of the total of 5374.7. Of the remaining
“corrected” total sum of squares about the mean of 3253.6, the part of
the model “orthogonal” to the mean accounts for 3218.9, i.e.,
approximately 98.9 percent (so that the squared multiple correlation
coefficient, R², is 0.989). The corresponding ratio, mean square for
the model with (9 − 1) = 8 degrees of freedom to mean square for error,
is over 11,000.
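
The quoted percentages and mean-square ratios can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (the totals above and the Model I residual sum of squares of 34.6675 quoted in Section 9.2); the variable names are illustrative, and Table IV remains the authoritative source.

```python
n_obs, n_coef = 960, 9

ss_total = 5374.7    # total sum of squares about 0
ss_mean = 2121.2     # sum of squares accounted for by the mean alone
ss_error = 34.6675   # Model I residual sum of squares (quoted in Section 9.2)

mse = ss_error / (n_obs - n_coef)             # mean square for error

# Partition about zero: model with 9 degrees of freedom.
ss_model = ss_total - ss_error
frac_explained = ss_model / ss_total
f_about_zero = (ss_model / n_coef) / mse

# Partition about the mean: model with 9 - 1 = 8 degrees of freedom.
ss_corrected = ss_total - ss_mean
ss_model_corrected = ss_corrected - ss_error
r_squared = ss_model_corrected / ss_corrected
f_about_mean = (ss_model_corrected / (n_coef - 1)) / mse

print(f"fraction explained about zero: {frac_explained:.4f}")
print(f"mean-square ratio about zero:  {f_about_zero:.0f}")
print(f"R squared:                     {r_squared:.3f}")
print(f"mean-square ratio about mean:  {f_about_mean:.0f}")
```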

It is worth emphasizing that the sample selection process which was 
used (see Section 7.1) is such that fitting the sample is, on a per ob- 
servation basis, a more challenging problem than it would be for the 
entire body of data (see Appendix B.3). 

A summary graphical indication of the appropriateness of the fit is
given in Fig. 20, which shows the fitted value plotted against the
observed value. A perfect fit (essentially impossible here with any
model based on x,L coordinates because different integral values of Y
are observed near the same x,L point) would be the diagonal straight
line shown. Deviations from fit should be gauged as horizontal spread
about the line, since the observed quantities are plotted as abscissa,
and are seen to be reasonably uniform throughout.

Incisive indication of the quality of fit was provided by various 
plots of residuals (against L, x, y, time, etc.). Some representative 
plots over all the HTB data are shown in Figs. 25 to 27 and Figs. 41 to 
43. 

As a further examination of the adequacy of the fit to the selected
HTB data, normal and half-normal probability plots (see Appendix B.8)
were prepared for the 745 residuals comprising the subset of the
960-point HTB sample for which x < x_c(L). These plots are shown in
Figs. 28 and 29.

Fig. 28 does display a generally good linear configuration indicating 
that the residuals may reasonably be regarded as a sample from a nor- 
mal distribution. There is no suggestion of general asymmetry or other 
distributional peculiarities. There are perhaps three values which are 
statistically “too large,” but not wildly so. Indeed, the plot is remark- 
ably well-behaved and reassuring. 

From some points of view, it is useful to consider the statistical
behavior of the residuals without regard to their sign. Fig. 29 is a
plot of the ordered absolute residuals against standard half-normal
(folded standard normal) quantiles. This presentation is more focussed
and sensitive to a statistical overabundance of large absolute
residuals. The plot is also very well-behaved, with indication of the
same three overly large values.
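
As a rough illustration of how such plots are constructed, the sketch below pairs ordered residuals with standard normal quantiles, and ordered absolute residuals with half-normal quantiles. The plotting positions (i + 0.5)/n and the least squares slope through the origin are assumptions made here for simplicity; Appendix B describes the procedure actually used. The demonstration data are synthetic, not the HTB residuals.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_plot(residuals):
    """(quantile, ordered residual) pairs for a normal probability plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    return [(STD_NORMAL.inv_cdf((i + 0.5) / n), r) for i, r in enumerate(ordered)]


def half_normal_plot(residuals):
    """(quantile, ordered |residual|) pairs for a half-normal plot."""
    n = len(residuals)
    ordered = sorted(abs(r) for r in residuals)
    # Half-normal quantile: P(|Z| <= q) = p  is equivalent to  q = Phi^-1((1 + p) / 2)
    return [
        (STD_NORMAL.inv_cdf(0.5 + 0.5 * (i + 0.5) / n), r)
        for i, r in enumerate(ordered)
    ]


def slope_through_origin(points):
    """Least squares slope of a line through the origin."""
    sxy = sum(q * r for q, r in points)
    sxx = sum(q * q for q, _ in points)
    return sxy / sxx


# Synthetic, idealized residuals with standard deviation 0.21 (illustrative only).
sigma = 0.21
n = 745
demo = [sigma * STD_NORMAL.inv_cdf((i + 0.5) / n) for i in range(n)]

slope_normal = slope_through_origin(normal_plot(demo))
slope_half = slope_through_origin(half_normal_plot(demo))
print(slope_normal, slope_half)
```

For residuals that behave like a normal sample, both plots are close to straight lines whose slope estimates the standard deviation.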

The reason for omitting from these plots all residuals from points
for which x > x_c(L) is that, for those, the predicted value y is 0
and, in the great majority, the observed Y was 0; hence, the residual
is 0. Since it was exactly this information which determined the
estimate x_c(L), and since one could hardly expect a collection which
includes about 1/5 zeros to behave like a normal sample, these points
were omitted.

Fig. 28. Normal probability plot of residuals from fit of the model to
the HTB sample. (Ordinate: ordered residuals. Abscissa: quantiles of
the standard normal (Gaussian) distribution.)

From either Fig. 28 or Fig. 29 one can estimate a slope of about 0.21,
which is an estimate of the standard deviation of the (counting
rate)^½ observations, clear of the confounding influence of the
nonvariance-stabilized very low counting rate observations, since
observations for x > x_c(L) have been omitted. The corresponding
variance estimate, 0.044, clearly exceeds that from the Poisson
approximation, 0.023, and also is greater than the pooled value for
the MSD(Y), 0.039 (Appendix C), the overall HTB data MSR(Y), 0.038
(Appendix C), as well as the MSE(Y) from the fit to the 960 points,
0.036 (Table IV). This is as one would expect, since the variance
estimate from the slope of Figs. 28 and 29 is not downward biased by
the zero (and √(1/11)) residuals from the very low counting rate
observations for x > x_c(L), while the other quantities are so biased.
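
The Poisson value of 0.023 is presumably the usual variance-stabilization result: if N counts are recorded in an 11-second interval, the variance of √N is close to 1/4 for all but small means, so the variance of √(N/11) is close to 1/(4·11) ≈ 0.0227. The sketch below checks this by direct summation over the Poisson distribution. It is an illustration under that assumption, with hypothetical function names, and is not taken from the appendices.

```python
import math


def var_sqrt_rate(mean_count, interval_s=11.0):
    """Exact variance of sqrt(N / interval_s) for N ~ Poisson(mean_count)."""
    n_max = int(mean_count + 12.0 * math.sqrt(mean_count) + 30)
    first_moment = 0.0
    second_moment = 0.0
    for n in range(n_max + 1):
        # Poisson probability computed in log space to avoid overflow.
        log_pmf = -mean_count + n * math.log(mean_count) - math.lgamma(n + 1)
        p = math.exp(log_pmf)
        first_moment += p * math.sqrt(n / interval_s)
        second_moment += p * n / interval_s
    return second_moment - first_moment ** 2


delta_method_value = 1.0 / (4.0 * 11.0)   # about 0.0227

high_rate = var_sqrt_rate(100.0)   # well inside the belt: variance is stabilized
low_rate = var_sqrt_rate(0.1)      # near the cutoff: variance is not stabilized
print(delta_method_value, high_rate, low_rate)
```

The low-count case shows why the observations beyond the cutoff are described as not variance-stabilized.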

The excess of the variance estimate of 0.044 over the Poisson value
of 0.023 may be due to any or all of several factors, including: (i)
the noncorrectness of the Poisson assumption, (ii) temporal variations
in the radiation belts or the detection equipment, (iii) measurement
errors or computational biases in time record, ephemeris or magnetic
coordinates, etc., (iv) noise bursts (the outlandish values were
detected and discarded, but the general effect must be an upward bias
on variation), and (v) inadequacies in the model, including analytic
form and coordinates employed.

Fig. 29. Half-normal probability plot of absolute residuals from fit
of the model to the HTB sample.


8.2 Statistical Measures Over All the HTB Data

An extensive presentation and comparison of various functions of the
residuals over all the HTB data is given in Appendix C. Those results
provide (i) an empirical justification for the use of the square root
transformation; (ii) a strong indication that the fit attained by
Model I cannot be improved very much in the least squares sense over
all the HTB data; (iii) information on the extent of “unevenness” of
the cell-construction process by which the 960-point HTB sample was
selected; and (iv) some indication of differential effectiveness of
fit of Model I to the data for different x,L regions.


8.3 Statistical Properties of Estimates of the Coefficients and
Coefficient Functions


The least squares estimates of the nine coefficients of Model I fitted
to the 960-point HTB sample are given in Table IV, with their
approximate standard errors and pairwise correlations.* These provide
the information needed to obtain estimates and standard errors for
functions of the coefficients; e.g., y'(x,L), or A'(L), or the value
of the maximum counting rate, or the position in space at which the
intensity of high energy protons is maximum, etc. (See Appendix B for
the necessary formulae.)
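
A common way to obtain such standard errors is first-order (delta method) propagation, in which the variance of a function f of the coefficients is approximated by g'Σg, with g the gradient of f and Σ the covariance matrix built from the standard errors and correlations. The sketch below assumes this is the kind of formula intended; Appendix B is the authoritative source. The numerical example borrows two entries from Table V (Model II) only to have concrete numbers, and the function f = L_p − L_0 is a hypothetical choice.

```python
import math


def delta_method_se(gradient, std_errors, corr):
    """Approximate standard error of f(theta) from first-order propagation.

    gradient   : partial derivatives of f with respect to each coefficient
    std_errors : standard errors of the coefficient estimates
    corr       : correlation matrix of the coefficient estimates
    """
    k = len(gradient)
    variance = 0.0
    for i in range(k):
        for j in range(k):
            cov_ij = corr[i][j] * std_errors[i] * std_errors[j]
            variance += gradient[i] * gradient[j] * cov_ij
    return math.sqrt(variance)


# Illustrative inputs: standard errors of L_0 and L_p and their correlation,
# as listed in Table V for Model II.
se_l0, se_lp, rho = 0.0009, 0.0016, -0.304
se_difference = delta_method_se(
    gradient=[-1.0, 1.0],                 # f = L_p - L_0
    std_errors=[se_l0, se_lp],
    corr=[[1.0, rho], [rho, 1.0]],
)
print(f"approximate standard error of L_p - L_0: {se_difference:.5f}")
```

Because the two estimates are negatively correlated, the standard error of their difference is somewhat larger than it would be for independent estimates.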

Some of the pairwise correlations in Table IV are exceedingly high.
This may be due, in general, either to an unfortunate “design” (i.e.,
the array of positions of observations in x,L space in this
application) or to some inherent “coefficient redundancy” in the
model, or to both such blemishes. Occurrence of such near-singularities
can lead to practical difficulty with the iterative fitting
computation and/or make the individual coefficient estimates poorly
determined.

In the present model, only the coefficient L_0 has a direct physical
interpretation. Its estimate has a very small standard error and an
entirely bearable correlation with the remaining coefficient estimates
(all values of |a| < 0.5). Otherwise, physical interest centers mainly
on the coefficient functions A'(L), x_c(L), and y'(x,L), whose
estimation is considered in Sections 7.2, 7.4, 7.5, 7.6, and 8.4.

For a given model and specified coefficient values, the matrix of
approximate correlations depends only on the array of data positions
in x,L space. Thus, to check on whether the correlational problems
might be due to inadequacy of the practically available (selected)
array, a correlation matrix was computed using an “ideal” x,L array,
namely the 1034 values of (x,L) corresponding to the division of x,L
space described in Section 7.1 and Appendix B.3. While some minor
improvements in some of the correlations were noted, the changes were
small. Thus, it would appear that the main reason for the high
correlations is in fact some “coefficient redundancy” in the model.

* A rescaling of the values of ρ, namely as the quantity a defined and
motivated in Appendix B.5, is also given in Table IV. The coefficient
of dependence a has more nearly the behavior of a “linear utility
function.”
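
The definition of a is deferred to Appendix B.5 and is not reproduced in this section. As a reading aid only: several pairs of ρ and a quoted in this paper (ρ = 0.9995 with a = 0.97 in Section 8.5, and a number of entries of Table V) agree to rounding with the rescaling a = sign(ρ)(1 − √(1 − ρ²)). That formula is an inference from the printed numbers, not the authors' stated definition, and the function name below is hypothetical.

```python
import math


def dependence_a(rho):
    """Assumed rescaling of a correlation: sign(rho) * (1 - sqrt(1 - rho**2))."""
    return math.copysign(1.0 - math.sqrt(1.0 - rho * rho), rho)


# (rho, a) pairs as printed: the first from Section 8.5, the rest from Table V.
printed_pairs = [
    (0.9995, 0.97),
    (0.794, 0.39),
    (-0.956, -0.71),
    (-0.923, -0.62),
    (0.609, 0.21),
    (-0.806, -0.41),
]

for rho, a_printed in printed_pairs:
    print(f"rho = {rho:+.4f}  printed a = {a_printed:+.2f}  "
          f"assumed formula = {dependence_a(rho):+.3f}")
```

Under this assumed form, a stays near zero for moderate correlations and approaches 1 only when ρ is very close to 1, which is consistent with the footnote's description of a more nearly linear scale of dependence.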

Inspection of Table IV indicates that the very large correlations are
associated with some of the parameters of the A'(L) function, namely
a_1, a_2, a_3, and y, for all pairs of which |ρ| > 0.99 (i.e.,
|a| > 0.90). Moreover, it will be seen in Section 8.5 below that the
present parameterization of the model leads to a markedly large
indication of nonlinearity, and there is reason for believing that this
is largely due to the same subset of coefficients. The combination of
both defects stimulated development of Model II, which overcame them
(see Section IX).


8.4 Estimates of Functions of the Coefficients 


The estimates of the coefficient functions A'(L) and x_c(L) have been
discussed in Sections 7.4 and 7.5 and summarized in Figs. 10 and 11.
Their estimated standard deviations, on a “pointwise” basis, are
graphed in Fig. 22(a), while the approximate correlations of the
estimates of A'(L), x_c(L), and S, as functions of L, are shown in
Fig. 30(a).

Despite the near-singularities (i.e., |ρ| near 1) in the estimates of
some of the individual coefficients of A'(L), it is seen that the
estimate of the square root of the equatorial counting rate provided
by A'(L) is well-determined over the entire L range. The standard
error varies between approximate limits of 0.018 and 0.040,
nonmonotonically, and these values are typically less, sometimes by a
factor of 5 or more, than the standard errors from the corresponding
L-slice estimates (see Table II), reflecting in part the statistical
gain from the simultaneous two-dimensional fit.

For x_c(L), the standard error is less than 1 percent over much of
the range of L, rising to 3 percent for large L values where the data
are statistically inadequate.

The three correlation functions ρ_{A,x_c}(L), ρ_{A,S}(L), and
ρ_{S,x_c}(L), for the estimated coefficient functions A'(L), x_c(L),
and S, are plotted in Fig. 30(a) (see Appendix B.4 for formulae). In
general, these correlations are small (|ρ| < 0.5, |a| < 0.12). The
statement applies to the correlations involving A'(L) despite the very
high correlations among individual coefficients. The generally low
correlation between A'(L) and x_c(L) is as intuitively expected, since
A'(L) is influenced mainly by observations at small x while x_c(L) is
determined mainly by those at large x. The exception is near L = L_0,
where ρ_{A,x_c}(L_0) approaches 1 as a result of the fact that the
coefficient L_0 is common to both functions and that the forms of
A'(L) and x_c(L) [see (4), (5), and (6)] require that both functions
be zero when L = L_0.

Fig. 30. Correlation coefficients of A with S, S with x_c, and A with
x_c calculated from the fits to the HTB sample and plotted as functions
of L. (a) Model I. (b) Model II.

The statistical correlation between the fitted A and x_c for the
L-slice fits was always positive (see Table II), which is not the case
for ρ_{A,x_c}(L). This change in sign gives some indication of basic
differences in behavior between the results of the two-dimensional fit
and the outcome of the collection of one-dimensional L-slice fits.

The (A, S) and (S, x_c) correlations have the same signs in all cases.
The magnitude of the correlations among A, x_c, and S is larger for
the L-slice fits (see Table II) than for the HTB fit at corresponding
values of L [see Fig. 30(a)]. This is very noticeable for L greater
than 1.7, particularly for the large correlation between S and x_c. It
is these large correlations which make it difficult to obtain reliable
L-slice estimates of x_c or S when L_m > 2 (see Fig. 6) or when the
distribution of the data within an L-slice is poor (see Fig. 7).


8.5 Nonlinearity Indices and Dependence of Estimates 


Appendix B.5 discusses the use of the sum of squares function (i.e.,
sum of squares of differences between observed value and “fitted”
value, as a function of proposed coefficients) as an indicator of the
joint dependence and behavior of the coefficient estimates, and the
fact that the extent to which the contours of the sum of squares
function are approximated by a certain family of ellipsoids provides a
measure of linearity of the model.

Fig. 31 shows 4 of the 36 pairwise projections of the 9-dimensional
ellipsoid, whose size would correspond to a “0.99 joint confidence
coefficient” as discussed in Appendix B.5. The axes are scaled in each
case according to the standard error of the coefficient. The
orientation and shape of the ellipse correspond directly to the sign
and magnitude of the correlation, ρ, or its transform, a, for the pair
of coefficients. Thus, for example, Fig. 31(a) shows the projection
onto the a_1-a_3 plane. The resulting very narrow positively inclined
ellipse corresponds to a very high positive correlation of a_1, a_3
(ρ = 0.9995, a = 0.97). (The 45° inclination of the graphed ellipses
is a result of scaling the axes by their standard errors.) Part (b) of
the figure shows a narrow negatively inclined ellipse for the case of
rather large negative correlation between the a_3 and y estimates.
Parts (c) and (d) illustrate results for small and negligible
correlations between Lo, re and rz, S, respectively.

At various positions on these ellipses there appear numbers which 
are ratios of the actual sum of squares at that “point” to the minimum 
sum of squares. The computation of the actual sum of squares is done 
for the coefficient values corresponding to the point on the 9-dimen- 
sional ellipsoid which projects into the point on the plotted ellipse. 
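
The geometry of these projected ellipses follows from the 2 × 2 correlation matrix of the pair of estimates once the axes are scaled by the standard errors: its principal directions are always at ±45°, and its eigenvalues are 1 + ρ and 1 − ρ. The sketch below computes the resulting axis ratio; the function name is hypothetical, and the value ρ = −0.5 is an arbitrary illustration rather than a value from Table IV.

```python
import math


def standardized_ellipse(rho):
    """Axis ratio (minor/major) and major-axis inclination in degrees for the
    projected confidence ellipse of two estimates with correlation rho, when
    both axes are scaled by the standard errors."""
    major = math.sqrt(1.0 + abs(rho))
    minor = math.sqrt(1.0 - abs(rho))
    if rho > 0:
        inclination = 45.0
    elif rho < 0:
        inclination = -45.0
    else:
        inclination = None   # a circle has no preferred direction
    return minor / major, inclination


narrow_ratio, narrow_angle = standardized_ellipse(0.9995)   # the a_1, a_3 pair
mild_ratio, mild_angle = standardized_ellipse(-0.5)         # arbitrary example
print(narrow_ratio, narrow_angle)
print(mild_ratio, mild_angle)
```

For ρ = 0.9995 the minor axis is under 2 percent of the major axis, which is why the ellipse in part (a) appears as little more than a line segment.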

If, in fact, the coefficients occurred linearly, all of these numbers
on all of the pairwise ellipses would be constant and in the present
case would have the value 1.023, corresponding to a sum of squares of
residuals of about 35.47. As a basis for judging the actual values and
their variability, the following table gives values which this ratio
would have, if the coefficients did occur linearly, for various joint
(9-dimensional) “confidence coefficients”:


Conf. Coeff.    Contour Ratio
   0.90             1.015
   0.95             1.018
   0.99             1.023
   0.999            1.029

In view of the variability of the actual ratios in Fig. 31, and of the 
extent to which some depart from the values in the above table, it is 
clear that in the present form of the model the coefficients behave 
jointly in a markedly nonlinear fashion even in a relatively small 
neighborhood around the least squares estimate. 

Inspection of the entire set of (9)(8)/2 = 36 pairwise plots strongly
suggests that a major part of this nonlinear behavior derives from the
coefficients a_1, a_2, a_3, and y of the A'(L) part of the model. These
also are the coefficients whose estimates exhibit the undesirably high
correlations which have been shown above to be due mainly to a
“coefficient redundancy” in the model.

Direct interpretation of the ellipses in Fig. 31, as indicating
interdependence of the coefficient estimates, depends heavily on the
appropriateness of the linear approximation in the neighborhood of the
least squares estimate. Since the nonlinearity index is in fact
distressingly large, one must be cautious in interpreting the ellipses
or their associated correlation or dependence coefficients.


8.6 Summary Statistical Criticisms of Model I

Model I, with coefficients determined by fitting to the 960-point
HTB sample, has been shown to provide a very good fit both to the
sample and to the entire body of some 41,000 HTB observations.
Moreover, the interesting coefficient functions y'(x,L), A'(L), and
x_c(L) have stable statistical properties, as has the physically
interpretable coefficient L_0.


1382 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967


Fig. 31. Examples of projections of the approximate “0.99 joint
confidence region” for the estimates of Model I. (Axes are scaled by
standard errors.)

However, the model has two statistical defects: Firstly, although
the model gives an extremely good fit to the data, the parameters
a_1, a_2, a_3, and y of the A'(L) part of the model have exceedingly
high pairwise correlations (see Table IV), and these were shown not to
be due to an obviously defective design. Secondly, the model
coefficients exhibited distressingly high nonlinearity of behavior
even within rather close neighborhoods of their least squares
estimates, with grounds to suspect that this was caused by the a_1,
a_2, a_3, y group of coefficients. In addition, most of the
coefficients of Model I do not have any directly meaningful physical
interpretation.

The modifications which led to Model II, as discussed in the
following Section IX, overcome these defects of Model I while
retaining all its virtues.


IX. THE MODEL II FIT TO THE HTB DATA


This section presents the statistical analysis of the HTB data
using Model II, a modified version of Model I. The emphasis in the
presentation is on comparisons of Models I and II. Since it is shown
how very closely the fit of Model II approximates that of Model I,
such aspects as the direct presentation of Model II residuals over all
the data are unnecessary, and hence omitted.


9.1 Model II 


The definition of Model II has been given in Section 4.6, together 
with a discussion of the physical interpretation of its coefficients and 
its mathematical relation to Model I. Specifically, the 8-coefficient 
Model II constitutes a specialization and reparameterization of the 
9-coefficient Model I. Thus, it follows that the minimum sum of 
squares in fitting Model II to any body of data can not be less than 
that from fitting Model I, though this may not be true of the mean 
square error. 

The evolution of Model II from Model I did not arise from any 
simply described systematic process, as is indeed true in other aspects 
of this study. Once the basic achievements of Model I were estab- 
lished it was then opportune to focus on major remaining defects. The 
character of these defects strongly urged elimination of one or more
coefficients in conjunction with a nonlinear reparameterization of
the coefficients. The solution achieved was arrived at by empiricism, 
persistence and good luck. 

The remainder of this section documents the assertion that Model 
II retains all the virtues of Model I while overcoming its defects. 


9.2 The Fit of Model II to the 960-point HTB Sample

The analysis of variance from fitting the 960-point HTB sample by
means of the 8-coefficient Model II is given in Table V. As expected,
the residual sum of squares, 34.7126, of Table V exceeds that of
Table IV, namely 34.6675. This difference is associated with the
one-degree-of-freedom nonlinear constraint defined in (13). Thus, we
see that the sum of squares associated with the one-degree-of-freedom
nonlinear constraint is (34.7126 − 34.6675) = 0.0451, and this gives a
ratio of less than 1.24 in relation to the mean square error of
0.03645. The value 1.24 corresponds to the upper tail 27 percent point
of the chi-squared-with-one-degree-of-freedom distribution. The
proportionate increase in the sum of squares for error is about 0.13
percent, and the increase in the mean square error is less than one
part in 3000. Multiple R² = 0.989 is effectively unchanged.

For the models of both Tables IV and V, the coefficient S is treated
as constant with L. If Model II is modified so that
S(L) = s_0 + s_1 L, then fitting this 9-parameter version of Model II
yields a sum of squares for error of 34.520. Thus, we would have a sum
of squares of (34.713 − 34.520) = 0.193 associated with the
“hypothesis” s_1 = 0. The main point of quoting this result is to
indicate that these minor differences in the sums of squares for error
are judged as unimportant in this context, even if under some highly
formalized assumptions the distinctions are “statistically
significant.”
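
The two one-degree-of-freedom comparisons above can be retraced numerically. In the sketch below the sums of squares are those quoted in the text; referring each ratio to a chi-square distribution with one degree of freedom, and the choice of the 9-coefficient fit's mean square error as the denominator in each case, are assumptions made here for illustration.

```python
import math


def chi2_1_upper_tail(x):
    """P(chi-square with 1 d.f. > x), via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


n_obs = 960

# Model I (9 coefficients) against Model II (8 coefficients).
ss_model_1 = 34.6675
ss_model_2 = 34.7126
mse_model_1 = ss_model_1 / (n_obs - 9)
constraint_ss = ss_model_2 - ss_model_1
constraint_ratio = constraint_ss / mse_model_1
constraint_tail = chi2_1_upper_tail(constraint_ratio)

# Model II (8 coefficients) against Model II with S(L) = s_0 + s_1 L.
ss_linear_s = 34.520
s1_ss = 34.713 - ss_linear_s
s1_ratio = s1_ss / (ss_linear_s / (n_obs - 9))
s1_tail = chi2_1_upper_tail(s1_ratio)

print(f"constraint: SS = {constraint_ss:.4f}, ratio = {constraint_ratio:.3f}, "
      f"upper tail = {constraint_tail:.2f}")
print(f"s_1 = 0:    SS = {s1_ss:.3f}, ratio = {s1_ratio:.2f}, "
      f"upper tail = {s1_tail:.3f}")
```

On these assumptions the first difference is well within ordinary sampling variation, while the second falls in the upper few percent of the reference distribution, which matches the remark that it may be formally significant yet unimportant in this context.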

Of greater interest and sensitivity are the following considerations:
(i) the behavior of the residuals from Model II as functions of x, L,
and y; (ii) the behavior of the differences between Models I and II;
(iii) comparisons of the estimates of A'(L) of Model I and A''(L) of
Model II [see (6) and (11)]; (iv) comparisons of the estimates of
x_c(L) from the two models; (v) the pattern of correlations of the
estimates of the eight Model II coefficients; and (vi) the indices of
nonlinearity for the estimates of Model II.


9.3 Residuals of Model II Fit and Differences Between Models I and II

Figs. 32, 33, and 34 are plots of the residuals of the 960-point HTB
sample from the fitted values of Model II against L, x, and Y,
respectively. These plots show no systematic structure and are quite
similar to analogous plots for Model I. Furthermore, Fig. 35, showing
the observed Y versus fitted y'' for Model II, is as well-behaved as
the corresponding Fig. 20 for Model I.


TABLE V. FIT OF MODEL II TO 960-POINT HTB SAMPLE

Analysis of variance

Due to    d.f.*   Sum of squares   Mean square
Total      960      5374.7321
Model        8      5340.0195        667.5024
Error      952        34.7126          0.0365

Coefficient estimates

                   A_p      L_0      L_p      n        r_1      r_2      r_3      S
Estimate          8.0762   1.1293   1.4644   5.2187   0.2658  -0.5082   0.3638   0.3225
Standard error    0.0342   0.0009   0.0016   0.0474   0.0083   0.0241   0.0196   0.0047

Correlations ρ of the estimates (below the diagonal) and a values
(above the diagonal)

         A_p      L_0      L_p      n        r_1      r_2      r_3      S
A_p               0.01     0.00     0.02    -0.02     0.01    -0.02     0.21
L_0     0.140             -0.05    -0.11    -0.41     0.19    -0.09     0.00
L_p     0.046   -0.304              0.39     0.00     0.00    -0.01     0.01
n       0.203   -0.451    0.794              0.02    -0.00    -0.01    -0.00
r_1    -0.181   -0.806    0.045    0.214             -0.62     0.39    -0.01
r_2     0.162    0.589    0.090   -0.021   -0.923             -0.71     0.00
r_3    -0.171   -0.422   -0.151   -0.107    0.793   -0.956             -0.00
S       0.609    0.015    0.099   -0.010   -0.168    0.098   -0.085

* degrees of freedom

Figs. 36, 37, and 38 show the deviations between the fitted Models 
I and II plotted against L, x, and Y, respectively. Of course these 
figures show a systematic structure since one is plotting the difference 
of two smooth functions. However, the actual differences are totally 
insignificant in the light of the data. (Note that the scale for Figs. 
36, 37, and 38 differs from that of Figs. 32, 33, and 34 by a factor of 
10.) 

Thus, with one less coefficient, Model II fits the data essentially 
as well as Model I, to which indeed it is an excellent approximation. 
It has the merit that the physically arbitrary coefficients a_1, a_2, 
and a_3 of Model I have been replaced by A_p and L_p, which do have 
direct physical interpretations. As will be detailed in the next 
subsection, Model II also has additional attractive statistical 
attributes. 

Fig. 32 – Residuals (Y − y) from the fit of Model II to the 960-point HTB 
sample vs L. 

Fig. 33 – Residuals (Y − y) from the fit of Model II to the 960-point HTB 
sample vs x. 


9.4 Coefficient Estimates 


Table V gives the least squares estimates of the eight coefficients of 
Model II together with their approximate standard errors, correlations, 
and a values. The estimates are seen to be extremely well-determined. 
In particular, for the physically meaningful quantities A_p, L_0, and 
L_p the standard errors are about 0.4, 0.1, and 0.15 percent, 
respectively, while for the shape coefficients n and S they are about 
1 and 1.5 percent, respectively. 
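The percent standard errors quoted above follow directly from the entries of Table V. A minimal sketch, using only the tabulated estimates and standard errors (the dictionary keys are our own labels for the coefficients):

```python
# Estimates and approximate standard errors as listed in Table V.
table_v = {
    "A_p": (8.0762, 0.0342),
    "L_0": (1.1293, 0.0009),
    "L_p": (1.4644, 0.0016),
    "n":   (5.2187, 0.0474),
    "r_1": (0.2658, 0.0083),
    "r_2": (-0.5082, 0.0241),
    "r_3": (0.3638, 0.0196),
    "S":   (0.3225, 0.0047),
}

# Standard error expressed as a percentage of the magnitude of the estimate.
percent_se = {
    name: 100.0 * se / abs(estimate)
    for name, (estimate, se) in table_v.items()
}

for name, value in percent_se.items():
    print(f"{name:>4}: {value:5.2f} percent")
```

Because the tabulated standard errors carry only one or two significant figures, the percentages obtained this way agree with those in the text only to within rounding.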

Comparison with Table IV shows that the standard error has decreased 
for every coefficient which is common to the models. The most dramatic 
change is for the coefficient whose standard error diminished by a 
factor of about 8. 

The estimates of A'(L) and A''(L) are in very close correspondence 
as implied by Fig. 36. The comparison of Fig. 22(b) with Fig. 22(a) 
indicates that the standard error of A''(L) is uniformly lower than 
(but in general agreement with) that of A'(L). 

Entirely similar remarks apply to comparison of estimates of x_c(L) 
from Models I and II, as also documented by Figs. 22(a) and 22(b). 

It has already been shown that the fitted values of y'(x, L) and 
y''(x, L) are in very close agreement. The pattern of contours of the 
percent standard errors of [y''(x, L)]², in Fig. 24(b), shows that the 
standard error is everywhere smaller than the corresponding results 
for Model I, in Fig. 24(a). 

Fig. 34 – Residuals (Y − y) from the fit of Model II to the 960-point HTB 
sample vs Y. 

One of the most dramatic changes between Models I and II is 
indicated by comparison of the correlations in Tables IV and V. The 
very large correlations (|ρ| > 0.99, |a| > 0.9) among the A'(L) 
coefficients of Model I do not occur for Model II. Only the (r_1, r_2) 
and (r_2, r_3) coefficient pairs of Model II have |a| values above 0.5. 
This is inconsequential since these are physically arbitrary 
coefficients of a cubic polynomial. 

The correlations of A''(L), x_c(L), and S from Model II remain much 
like the corresponding results for Model I, as shown in Fig. 30. 


9.5 Nonlinearity Indices 


A further virtue of Model II is indicated by the behavior of the 
nonlinearity index shown for the examples of “confidence regions” 
in Fig. 39. (See Appendix B for general discussion and definition.) 
Specifically, it is seen that the numbers on the ellipses vary very 
little and this is true for all 28 of these ellipses. These numbers would 
be constant and all equal to 1.023 if the model were linear in the 
fitted coefficients. Comparatively, Model II does indeed behave in a 
reassuringly linear fashion. For sharp contrast, we may compare Fig. 
39 with Fig. 31, for Model I, in which the values range up to 1000 
around the 9-dimensional ellipsoid. 

Fig. 35 – The value of y'' computed from the fit of Model II vs the observed 
value, Y, for the 960-point HTB sample. 

Fig. 36 – Deviations between the Model-I fit, y', and the Model-II fit, y'', vs 
L, for the 960-point HTB sample. 

The nonlinear behavior of Model I in relation to the linear behavior 
of its specialized reparameterized version, Model II, is indicative 
of the reason for the high nonlinearity indices for Model I. 
Effectively, a p-coefficient model defines a constraining “surface” 
of p dimensions (p is 9 and 8 for Models I and II, respectively) in 
the n-dimensional space of the observations (n is 960 in the present 
case). In a small neighborhood of the least squares estimate, this 
p-dimensional surface may or may not be planar. If it is not planar, 
one will obtain high indices of nonlinearity. If it is planar, then one 
will or will not obtain high nonlinearity indices according to whether 
the individual coefficient coordinates within the p-dimensional surface 
are nonlinearly or linearly behaved. 

Fig. 37 – Deviations between the Model-I fit, y', and the Model-II fit, y'', vs 
x, for the 960-point HTB sample. 

Fig. 38 – Deviations between the Model-I fit, y', and the Model-II fit, y'', vs 
Y, for the 960-point HTB sample. 

It is likely that the 9-dimensional surface defined by Model I is 
indeed reasonably planar, but the coordinate system defined by the 
coefficients is highly nonlinear. 

The correlation and nonlinearity effects, it should be noted, are not 
in principle related. One can have very high correlations with linear 
models and very low correlations with very nonlinear ones. 
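That high correlation can occur in a strictly linear model may be illustrated with a straight-line fit. The following is a sketch under our own assumptions (a hypothetical design of 11 equally spaced abscissas far from the origin, unrelated to the Telstar data); it uses the standard result that the covariance matrix of the least squares estimates is proportional to the inverse of X'X.

```python
import math


def intercept_slope_correlation(xs):
    """Correlation between the least squares estimates of a and b in y = a + b*x.

    With X'X = [[n, sum(x)], [sum(x), sum(x^2)]], inverting the 2x2 matrix
    gives corr(a, b) = -sum(x) / sqrt(n * sum(x^2)).
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    return -sum_x / math.sqrt(n * sum_xx)


# Hypothetical design points: abscissas between 10 and 11.
xs = [10.0 + 0.1 * k for k in range(11)]
corr_raw = intercept_slope_correlation(xs)

# The same design with the abscissa measured from its mean.
mean_x = sum(xs) / len(xs)
corr_centered = intercept_slope_correlation([x - mean_x for x in xs])

print(f"raw abscissa:      corr(a, b) = {corr_raw:.4f}")
print(f"centered abscissa: corr(a, b) = {corr_centered:.4f}")
```

The model is linear in both parameterizations, yet the correlation of the estimates is nearly −1 in one and essentially zero in the other, so correlation alone says nothing about nonlinearity.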


9.6 Summary Comments 


Model II has been presented and validated as an evolution of Model 
I. Though Model II represents the current recommended fit from 
this study, several aspects of its justification, and of other comparisons 
in this paper, are based on the Model I fit. For example, the statistical 
study of residuals over all the HTB data, discussed in various places 
including Appendix C, is based on Model I. This hybrid attitude is 
entirely sound, since the range of deviation between Models I and II is 
small compared to the range of residuals from the fitted sample. 

Thus Model II provides a fit to the HTB data in which the 8 estimated 
coefficients provide a “good description” of about 41,000 observations. 
The deviations of the fit from the data are within reasonable statistical 
fluctuations: variation in telemetered counting rates, orbital errors, 
observational errors, mapping-to-magnetic-coordinate uncertainties, 
etc. (See Appendix C.3.) A number of the coefficients have physical 
interpretations and these are statistically well-determined and 
relatively uncorrelated. Model II, though nonlinear in the coefficients, 
behaves in a very linear fashion in the neighborhood of the least 
squares estimates. 


X. TEMPORAL VARIATIONS 


This section and the two to follow are devoted to discussion of 
some specific physical results of the analysis. 

Temporal variations are considered in three classes: diurnal 
(day-night), secular, and short term. Residual plots were used to study 
these effects. 


10.1 Diurnal Effects 

The HTB residuals were plotted against local time for various 
x,L regions. The HTB data are not well-distributed in local time 
near the magnetic dipole equator, making it difficult to draw firm con- 
clusions. However, no evidence of a diurnal variation was found. 

Specifically, to produce a change of about two percent in the 
average value of Y on the equator (x = 0) would require a diurnal 
shift in the radial position of the magnetic field line of about 0.01 
R_e at L = 1.35, and a shift of about 0.02 R_e at L = 1.55, if there 
were no other effects. At these two positions, the value of y is large 
(y = 8) and dy/dL is large, and a two-percent change in y would 
correspond to a shift in the mean of the residuals of 0.16 between 
noon and midnight local time. An effect of this magnitude would be 
readily observable on the residual plots. 
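The numbers in this argument can be checked with a few lines of arithmetic. This sketch assumes a mean earth radius of 6371 km, a value not stated in the text, to convert the radial shifts into kilometers.

```python
# Assumption: mean earth radius in km (not given in the text).
R_E_KM = 6371.0

# A two-percent change in y where y is about 8 (units of Y).
y_equator = 8.0
fractional_change = 0.02
residual_shift = fractional_change * y_equator

# Radial shifts of the field line quoted in the text, in earth radii, keyed by L.
radial_shift_re = {1.35: 0.01, 1.55: 0.02}
radial_shift_km = {L: shift * R_E_KM for L, shift in radial_shift_re.items()}

print(f"shift in mean residual = {residual_shift:.2f}")
for L, km in radial_shift_km.items():
    print(f"L = {L}: radial shift of about {km:.0f} km")
```

The converted shifts are of the same size as the detection limits of 70 km and 140 km quoted in the text.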

Thus, it is unlikely that displacements larger than 70 km and 140 
km, at equatorial L's of 1.35 and 1.55, respectively, would escape 
detection, and these distances are offered as upper limits to the 
day-night changes of the magnetic field at the two positions. As both of 
these displacements are equivalent to a change in field strength of 
about 300 gamma (0.003 gauss), this particle experiment does not 
qualify as a sensitive indicator of adiabatic changes in the earth's 
magnetic field. 

Fig. 39 – Examples of projections of the approximate “0.99 joint confidence 
region” for the estimates of Model II. (Axes are scaled by standard errors.) 


10.2 Secular Effects 


The HTB residuals are plotted against elapsed time, in days, for 
1.85 < L < 1.90, in Fig. 40. It would appear that the average value of 
Y decreased between days 191 and 225. This decrease is exhibited in all 
parts of the belt where we have measurements during this interval. 
Between days 191 and 225, the orbit of the Telstar® 1 satellite did not 
take it into the central region of the belt {1.8 S$ LD S$ 1.8, S$ 10°}. In 
other regions the decrease in the average value of Y over this period is 
about ten percent. The extremes are two percent and 20 percent, but it 
is not possible to separate out other variables which may be influencing 
the results. 

Fig. 40 – HTB residuals of Y (i.e., Y − y calculated from the HTB coefficients) 
plotted against time for 1.85 < L < 1.90. The arrows indicate ± the approximate 
standard deviation if Y² were Poisson distributed. 

From the magnitude of this effect, it is clear that it must be 
contributing substantially to the MSR. A decrease of ten percent in the 
average value of Y corresponds to a decrease of about 20 percent in 
the flux. A fractional change in the flux which is independent of x 
and L cannot be distinguished from a change in the characteristics 
of the instrument. Among other possibilities, radiation damage or the 
decay of protons which might have been associated with the Starfish 
high-altitude nuclear test of July 9 (day 190), 1962 might have 
produced the observed effects. Because of this ambiguity, we are 
unable to offer any well-founded interpretation of the time dependence 
of the data before day 225. For reasons to be noted shortly, 
ambiguities are also encountered when interpretation of the temporal 
behavior of data acquired after day 400 is attempted. In the inter- 
mediate period, the time dependence does vary with x and L. By 
using Fig. 40, which shows comparatively little fluctuation during 
this intermediate period, as a standard we are able to measure 
relative changes in the belt. The stretches of sparse data near days 
240 and 320 in Fig. 40 are a result of the orbital configuration, there 
being less opportunity to acquire “high-temperature” data during 
these periods. The absence of HTB data between day 325 and 373 
was caused, as noted in Section 6.9, by the low bias condition that 
existed during that time. However, an examination of residuals from 
the CB fit between days 325 and 373 reveals nothing that vitiates the 
conclusions drawn from the HTB data in what follows. 
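The correspondence between a ten-percent decrease in Y and a decrease of about 20 percent in the flux, used in the preceding paragraph, follows from the flux being proportional to the counting rate Y². A minimal check:

```python
# Y is the square root of the counting rate, so the flux scales as Y**2.
fractional_decrease_in_y = 0.10

# Fractional decrease in flux implied by the decrease in Y.
fractional_decrease_in_flux = 1.0 - (1.0 - fractional_decrease_in_y) ** 2

print(f"decrease in Y:    {100 * fractional_decrease_in_y:.0f} percent")
print(f"decrease in flux: {100 * fractional_decrease_in_flux:.0f} percent")
```

For small changes the flux change is close to twice the change in Y, which is the approximation used in the text.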

Residuals versus time-in-days have also been plotted for x,L cells 
of size 0.1 in L by 0.2 in x. Below L = 1.9 we find only one change 
with time within the sensitivity of our measurements, namely, a 
secular decrease between days 225 and 400 which occurs only near the 
ends of the field lines (x ≥ x_c − 0.2). We are unable to quantify this 
effect because, in order to see the droop above the noise, we need to 
collect residuals from a fairly sizable region of space. The term “sizable” 
means a region over which y changes so much that an average value of y 
in the region is not sufficiently representative to be used as a basis for 
computing a percent change in the flux. Fig. 41 gives an example of an 
x,L cell near the cutoff where this decrease may be seen. However, in 
the adjacent lower-x region, Fig. 42, where the ability to discriminate 
absolute changes in the average value of Y is the same and the ability 
to discriminate percent change in the average value of Y is much greater 
than for the region of Fig. 41, no corresponding secular decrease 
between days 225 and 400 is evident. 

Fig. 41 – HTB residuals of Y (i.e., Y − y calculated from the HTB coefficients) 
plotted against time for 1.6 < L < 1.7 and 0.8 < x < 1.0. The arrows indicate 
± the approximate standard deviation if Y² were Poisson distributed. 

The droop in the residuals after day 400, which is noticeable in 
Fig. 42, is characteristic of many of the plots of residuals versus 
time-in-days. The widespread occurrence of this effect confuses in- 
strumental and “real” variations and introduces unresolvable am- 
biguities when attempts are made to identify the source of the droop. 

The observation of the general downward slope in Fig. 41 might 
be explained by a small decrease in x_c, which corresponds to a small 
increase in the altitude of the cutoff, between August 1962 and 
January 1963 on L-shells below 1.9.2% Alternatively, one might be 
observing the decay of the 55 MeV protons whose perturbation by 
the Starfish high-altitude nuclear test of July 9 (day 190), 1962 and 
subsequent behavior have been measured by Filz?* near the bottom 
of the trapped proton belt. There are too few data for us to attempt 
further interpretation of this qualitative observation concerning the 
secular behavior of x_c. The number of points affected and the 
magnitude of the shift are too small for this effect to contribute 
appreciably to the MSR. 


10.3 Short-Term Effect 


The plots of the residuals versus time-in-days, for x,L regions, 
show a short-term fluctuation which is sufficiently singular to be 
referred to as an event. This event is an increase in the average value 
of Y over the 30-day period which starts about day 280. It can be 
seen clearly in Fig. 43. The increase is discernible only for L > 1.9. 

Fig. 42 – HTB residuals of Y (i.e., Y − y calculated from the HTB coefficients) 
plotted against time for 1.6 < L < 1.7 and 0.6 < x < 0.8. The arrows indicate 
± the approximate standard deviation if Y² were Poisson distributed. 

TABLE VI – FRACTIONAL INCREASE IN FLUX BETWEEN DAYS 280 
AND 310, 1962. 

Table VI gives the fractional increase in the average counting rate 
(Y²) during this period at various values of x and L. By L = 2.25 
the change is barely observable and for L > 2.3 it has disappeared. 
The data acquired between days 325 and 373, which are not included 
among the HTB data because the bias voltage was low, were 
examined; and there appears no reason to believe that there were any 
changes in the intensity of the >50 MeV protons for L > 1.9 during 
these 48 days. 

Fig. 43 – HTB residuals of Y (i.e., Y − y calculated from the HTB coefficients) 
plotted against time for 2.0 < L < 2.1 and 0.6 < x < 0.8. The arrows indicate 
± the approximate standard deviation if Y² were Poisson distributed. 

While it is not possible to be quite sure that we are observing a 
“true” temporal effect, it is difficult to contrive any alternate ex- 
planation. This event can be compared with the changes produced in 
the high energy proton distribution by the magnetic storm of Septem- 
ber 22, 1963, and observed with Relay 1*° and the Telstar® 2 satellite.’ 
In both cases only L shells with values above 1.9 were affected, and the 
effect is more pronounced at higher x's. However, the storm produced a 
decrease in flux whereas an increase was observed in 1962; the effects of 
the storm were more severe at larger L’s, whereas in this event, a max- 
imum fractional change was observed near L = 2.05; and the effect 
of the storm was sudden, i.e., the flux decrease took place within 24 
hours, while the increase observed in 1962 was gradual and required 
a month to complete. Increases in flux having some of the features 
described here were observed with Explorer 7.2° However, it is dif- 
ficult to be certain that those increases were caused by protons with 
energies above 18 MeV, rather than electrons with energies greater 
than 1.1 MeV. 

The high-energy protons appear very stable over the seven months 
covered by our data. In particular, no effects associated with the 
USSR high-altitude nuclear tests of October 22, October 28, and No- 
vember 1, 1962, or the large magnetic storm of December 18, 1962 have 
been observed. 

In summary, changes through time in the observed values of the 
flux are generally less than 20 percent, although they may be larger 
in some regions of space. We have not been able to detect a diurnal 
effect. Often, secular changes are not separable from other variables, 
an exception being an apparent change in the position of the cutoff. 
An event which appears to comprise a measurable redistribution of 
the proton flux over an appreciable volume of space and period of 
time has been noted. We do not know whether the redistribution is 
in energy or space, and find no indication of the mechanism in the 
data. 


XI. THE CUTOFF 


As discussed in Sections V, 6.3, and 7.5, the cutoff function, x_c(L), is 
defined in terms of our instrument, model and fitting procedure. For 
L < 2, the value of x_c(L) corresponds to the position on the given L 
shell at which the omnidirectional flux is of the order of 1 proton/cm² 
sec, more than three orders of magnitude below the highest flux in 
the belt. However, because the flux is falling so fast with x, this 
position is almost certainly very close to the place at which the flux 
becomes 0. The last statement is not true for L > 2. Here, although 
the value of x_c(L) (the place at which y = 0) still corresponds to 
the point at which the limit of sensitivity of our instrument is 
reached, the position of x_c is not so well-defined by the fit. In addition, 
one has only to examine Fig. 23 to realize that x_c may be significantly 
removed from the value of x at which the flux falls to zero. 

The Model-I HTB coefficients of Table IV define the cutoff function, 
and we have made use of a modification of R. H. Pennington’s 
mirror trace program* to calculate the minimum altitude corresponding 
to x_c(L) for L < 2.2. This inversion was accomplished using the 
Jensen and Cain magnetic field coefficients for 1960,2° the same set 
used to calculate x and L (see Table I). (Other sets of coefficients are 
available.27 However, using the GSFC (7/65) coefficients?® does not 
produce significantly different altitudes.) 

The minimum altitude is smallest in the Southern Hemisphere over 
the Atlantic Ocean. Fig. 44 shows the results in graphical form. The 
minimum altitude is 270 km near the equator (L = L_0 = 1.13), 
decreases to a minimum of ~160 km at L = 1.6, and increases very 
rapidly thereafter. For L less than 1.5, the standard error in altitude, 
derived from the standard error in x_c (see Fig. 22), is about 10 km, 
which is roughly the accuracy of the inversion procedure as we used 
it. The standard error in altitude for L > 1.5 is indicated by the 
dashed lines in Fig. 44. At L = 2, where the cutoff mechanism is only 
partially atmospheric, the standard error is nearly 50 km. 

The minimum near L = 1.6 in the altitude curve of Fig. 44 appears 
to reflect the existence of the South American magnetic anomaly. 
Although R,(L) [see (5)] increases monotonically with L for L > 1, 
the increase is apparently not fast enough to override the influence of 
the anomaly. This result is true for all the sets of coefficients pro- 
duced in many trial fits as well as for the HTB coefficients in Table 
IV. We have not yet carried out the obvious next step of averaging the 
atmospheric density over the orbital path of the protons to see 
whether or not the shape of Fig. 44 can be explained on the basis of 
present models of the atmosphere. 

Although the shape of the minimum altitude curve remains the 
same, the value of the altitude is sensitive to the method of selecting 
the sample (see Section 7.1). For example, the minimum value of 
altitude calculated from the CB coefficients is 100 km (again at 
L = 1.6), 60 km lower than the 160 km calculated from the HTB 
coefficients. The weighting of the HTB sample emphasizes the high-x 
data and gives better representation, and therefore a better 
expectation of fitting well, near the cutoff. However, the Telstar® 1 
satellite, with its eccentric orbit and relatively high (950 km) perigee, 
could not give detailed information about particles near the top of the 
atmosphere, and this is reflected in the results of the analysis. 


* Kindly communicated to us by D. J. Williams. 

In conclusion, the curve of Fig. 44 probably represents the 
qualitative behavior of the minimum altitude of the cutoff reasonably 
well, but the uncertainty in the value of the altitude is larger than a 
simple examination of the standard error plotted in the figure would 
lead one to believe. The implications of these results for the details 
of the cutoff mechanism have not been examined in detail; however, 
it is clear from the sudden upturn of the curve in Fig. 44 that the 
mechanism is principally atmospheric for L less than about 1.9 and 
principally nonatmospheric on higher L shells. 

Fig. 44 – The minimum altitude reached by > 50 MeV protons as a function 
of L. This altitude is determined in geographic coordinates from the transform 
of x_c(L). The dashed curves are ± one standard error. 


XII. COMPARISON WITH OTHER WORK 


12.1 Introduction 


When making comparisons among the various high-energy proton 
measurements it is desirable that the results be extensive in time and 
space, reported in terms of omnidirectional fluxes at various positions, 
and that these positions be expressed in magnetic coordinates de- 
rivable from the B,L set. A list of some experiments which meet these 
desiderata is given in Table VII. 

Following a presentation of flux maps, comparisons among these 
experiments are made with respect to the following features: the 
absolute intensity at one point in the belt, as close to the maximum of 
intensity as is practical; the intensity vs L in the equatorial plane; 
the behavior of the intensity on selected L shells; the flux near the 
top of the atmosphere, and the equatorial pitch angle distribution. 
Comparisons covering a larger range of proton energies have also 
been made by Vette?? and Fillius.?° 

One of the difficulties encountered in making comparisons among 
the various bodies of data is that most of the results have been pub- 
lished in graphical form, rendering it necessary to scale numerical 
values from small plots, an inaccurate procedure at best. A welcome 
exception is the Explorer 15 data, which McIlwain'® has made avail- 
able by means of a series of interpolation functions in the form of a 
FORTRAN computer program. 


12.2 Telstar® 1 Flux Maps 


For this discussion, the Telstar® 1 HTB results have been converted 
to omnidirectional flux, J, where J = 4πy²/g. (Note that the value of 
g derives from the assumptions of Appendix A regarding the energy 
spectrum.) This procedure provides an estimate of the flux of protons 
with energies between 50 and 130 MeV at positions, (x, L), in 
magnetic space on the basis of the presently provided model and fit to 
the HTB data. 


TABLE VII – SOME SATELLITE MEASUREMENTS OF THE 
HIGH-ENERGY TRAPPED PROTONS. 

                          Approx. period            Orbit 
                          covered in         perigee, apogee, incl,                                            Approx. energy 
Satellite                 reference          R_e      R_e     deg    Instrument                                range 
Explorer 4 (1958 ε1)      7/26/58 to *       1.041    1.347   50     Anton 302 Geiger tube (shielded)          > 43 MeV 
Injun 1 (1961 ο2)         7/61 to 12/61      1.14     1.16    67     Anton 2138 Geiger tube (shielded, SpB)    > 40 MeV 
1961 αδ1 (H2)             10/21/61 to **     1.59     1.55    96     scintillator                              > 59 MeV 
1962 κ1 (H3)              4/9/62 to **       1.54     1.44    87     scintillator                              > 59 MeV 
Telstar® 1 (1962 αε1)     7/10/62 to 2/21/63 1.15     1.90           solid-state detector                      50-130 MeV 
Explorer 15 (1962 βλ1)    10/27/62 to 1/27/63 1.049   3.72    18     scintillator                              40-110 MeV 
Relay 1 (1962 βυ1)        5/1/64 to 9/22/64  1.21     2.14    48     scintillator                              > 35 MeV 
Injun 3 (1962 βτ2)        12/24/62 to 9/28/63 1.037   1.44    70     scintillator                              40-110 MeV 

* Not stated. Re-entered atmosphere 10/23/59. 
** Not stated. 


Fig. 45 – Omnidirectional isoflux contours derived from the HTB coefficients 
and plotted in R,λ space. Dashes indicate extrapolation beyond the region in 
which data were acquired. Long dashes form contours of constant percent standard 
deviation. 

Label                                 A        B        C        D        E        F 
Omnidirectional flux,                 5 × 10³  2 × 10³  1 × 10³  5 × 10²  2 × 10²  1 × 10⁰ 
protons/cm² sec 

For ease of reference, Telstar® 1 HTB flux maps are presented in 
three commonly used forms: Fig. 45 shows contours of constant flux 
in R,λ coordinates; Fig. 46, contours of constant flux in B,L 
coordinates; and Fig. 47, log flux vs log B curves for various values of L. 
These three graphs give an overall picture of the particle distribution. 
In these figures, dashed lines are used to indicate the extrapolation of 
fitted values to regions not penetrated by the satellite. Note the way 
the geometry of the coordinate transformations affects the 
extrapolated regions. In particular, the functional extrapolation in B,L 
coordinates gives much more curvature to the contours than might 
be anticipated. The difference between the functional and straight 
line extrapolation in B,L can be as large as a factor of 2 in the 
flux (a shift of 0.2 in L) at L = 3. Except for the region of the 
secondary local maximum in the flux near L = 2.2, this functional 
extrapolation compares surprisingly well with the measurements made 
on higher altitude satellites.® 1® 

In the altitude range covered by the data, a single maximum is 
observed. This maximum in the omnidirectional flux of ≈ 6 × 10³ 
protons/cm² sec is located on the magnetic equator at R = L = 1.46. 
The intensity falls abruptly near the bottom of the belt (the top of the 
atmosphere) and decreases more gradually toward the sides and top 
of the belt. On a given L shell, the intensity is a maximum at the 
magnetic equator, and decreases monotonically as the distance from 
the equator increases. 

Neglecting the uncertainties in the calibration of the instrument 
(−25 to +50 percent), which are discussed in Appendix A and are 
mentioned in the next subsection, the estimated standard deviation of 
the estimate of J is less than 2 percent of J over much of the region of 
space discussed in this section. Smoothed contours of 1 percent, 2 
percent and 5 percent standard error are plotted as the dotted 
lines in Fig. 45. Near the cutoff, where the counting rate is falling to 
zero, the standard deviation in x_c (see Figs. 44 and 22) is a useful 
indication of uncertainty in the flux. Other information concerning 
standard deviations may be found in Sections 7.4 to 7.6 and 8.4, Figs. 
22, 23(a), 23(c), and 24. 

Fig. 46 – Omnidirectional isoflux contours derived from the HTB coefficients 
and plotted in B,L space (B in gauss). Dashes indicate extrapolation beyond the 
region in which data were acquired. Labeling is given in Fig. 45. 

Fig. 47 – Omnidirectional flux on several L-shells derived from the HTB 
coefficients and plotted against log B. Adjacent curves in parts (a) and (b) are 
slipped one decade in J. All curves rise from J = 1 proton/cm² sec. Dashes 
indicate extrapolation beyond the region in which data were acquired. 

The equations defining Model II (see Section 4.6) and coefficients 
of Table V, together with the transformation equations among various 
magnetic coordinate systems, allow accurate relative flux values to 
be easily calculated in any coordinate system. 


12.3 Comparison of Absolute Intensities 


The solid curve in Fig. 48 is the fitted omnidirectional equatorial 
flux of 50-130 MeV protons measured by the Telstar® 1 satellite. The 
points are fluxes observed on other satellites (Table VII) at the 
magnetic equator and corrected to 50-130 MeV by using a 
single-component integral energy spectrum of the form 


N(>E) ∝ E^(−M). (17) 


The values of M at the magnetic equator are plotted as the dashed 
line in Fig. 48. These values were taken from Gabbe and Brown,? and 
are consistent with those of Brown, Gabbe, and Rosenzweig, and also 
those of Fillius and McIlwain,** and Freden et al,*° where the data 
overlap. Because of uncertainties in the geometric factors of the 
detectors (see Appendix A) and changes in the belt with time (see 
Section X), one might expect agreement only within a factor of about 
2. On this basis the agreement in absolute intensity is quite 
reasonable. However, the Telstar® 1 measurements are somewhat on the 
low side, and those of Imhof and Smith*? (H2 and H3 on Fig. 48) are 
much higher than the other observations. 

Fig. 48 – Values of equatorial omnidirectional flux, for the satellites indicated 
in the legend, corrected to the energy range 50-130 MeV and plotted at the 
appropriate value of L. An integral power-law energy spectrum [see (17)] of 
exponent −M, where M is given as a function of L by the dashed curve, was used 
in making the corrections. References are given in Table VII. 

Fig. 49 – Values of equatorial omnidirectional flux, for the satellites and 
energy ranges indicated in the legend, plotted against L. The dashed curve is the 
ratio of Telstar® 1 to Explorer 15 measurements. References are given in Table 
VII. Legend (satellite, E_p in MeV): Telstar® 1, 50-130; Explorer 4, >43; 
Injun 1, >40; H2, >59; H3, >59; Explorer 15, 40-110; Relay 1, >35; Injun 3. 
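A correction of the kind described above can be sketched for the single-component integral spectrum of (17). This is an illustration only: the function names are ours, the value M = 2 is a placeholder rather than a value read from Fig. 48, and the exact procedure used for each satellite is not specified in the text.

```python
def band_fraction_of_integral(m, e_lo, e_hi, e_threshold):
    """Ratio J(e_lo..e_hi) / J(>e_threshold) for N(>E) proportional to E**(-m)."""
    return (e_lo ** -m - e_hi ** -m) / e_threshold ** -m


def band_fraction_of_band(m, e_lo, e_hi, other_lo, other_hi):
    """Ratio J(e_lo..e_hi) / J(other_lo..other_hi) for the same power law."""
    return (e_lo ** -m - e_hi ** -m) / (other_lo ** -m - other_hi ** -m)


# Placeholder exponent (an assumption for illustration only).
M = 2.0

# Factor converting a >43 MeV integral flux to the 50-130 MeV band.
factor_integral = band_fraction_of_integral(M, 50.0, 130.0, 43.0)

# Factor converting a 40-110 MeV band flux to the 50-130 MeV band.
factor_band = band_fraction_of_band(M, 50.0, 130.0, 40.0, 110.0)

print(f"from >43 MeV:     multiply by {factor_integral:.3f}")
print(f"from 40-110 MeV:  multiply by {factor_band:.3f}")
```

Because the thresholds of the detectors in Table VII all lie near 50 MeV, factors of this kind stay of order unity for any plausible M, which is why the spectral correction is unlikely to account for the discrepancies in absolute flux.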

The points represent measurements taken before, after, and during 
the Telstar® 1 experiment so it is unlikely that changes in the flux 
with time explain these differences. It is difficult to account for the 
discrepancies in absolute flux in terms of the spectral correction, 
unless more complex spectral forms than those of Appendix A are 
considered, because the comparisons are among results of detectors 
whose threshold energies are close to 50 MeV. The most likely sources 
of the differences are errors in absolute calibration. It follows that 
a good deal of caution should be exercised in drawing conclusions 
about temporal effects and energy spectra from measurements made 
with different instruments. 
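
The size of the spectral correction discussed above can be illustrated with a minimal sketch. It assumes that the single-component integral power law (17) holds exactly over all the energies involved, and it rescales a flux measured above a threshold to the 50-130 MeV band. The function name and the sample numbers are hypothetical; they are not taken from Table VII.

```python
def correct_to_band(flux_above_e0, e0, m, e_lo=50.0, e_hi=130.0):
    """Rescale an integral flux J(>e0) to the band e_lo..e_hi (MeV).

    Assumes N(>E) proportional to E**(-m), as in (17), so that
    J(e_lo..e_hi) / J(>e0) = (e_lo**-m - e_hi**-m) / e0**-m.
    """
    if e0 <= 0 or m <= 0:
        raise ValueError("threshold energy and exponent must be positive")
    band_fraction = (e_lo ** -m - e_hi ** -m) / e0 ** -m
    return flux_above_e0 * band_fraction


# Hypothetical example: a detector with a 40-MeV threshold and exponent M = 1.
print(correct_to_band(1000.0, e0=40.0, m=1.0))
```

For thresholds close to 50 MeV the correction factor depends only modestly on M under this assumption, which is in line with the remark that the spectral correction is an unlikely source of the discrepancies.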


12.4 Intensity vs L in the Equatorial Plane 

Fig. 49 is a plot of the omnidirectional equatorial flux for each of 
the satellites listed in the legend of the figure. The data are from de- 
tectors having several different energy ranges and no spectral cor- 
rections have been made. The general features of the data in these 
energy ranges have been noted previously in the literature. The flux 
increases rapidly with L, goes through a maximum near L = 1.5 and 
then decreases. The decrease is not as rapid as the initial rise and in 
this energy range the flux generally does not decrease monotoni- 
cally?® 2° for L > 2. Excepting the measurements of Imhof and 
Smith,®? the flux decreases with increasing energy, indicating a falling 
energy spectrum. 

The dashed line in Fig. 49 is the ratio of the 50-130 MeV proton flux 
measured with Telstar® 1 to the 40-110 MeV proton flux measured with 
Explorer 15. This ratio is a good qualitative index of the energy spec- 
trum near 45 MeV, and in these circumstances the change in this index 
is independent of the absolute calibrations of the instruments. The 
ratio is seen to decrease monotonically as L goes from 1.25 to 1.9, 
indicating, in agreement with the references cited in the previous sub- 
section, a softer spectrum* at higher L. 
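
The sense of this index can be checked with a small calculation. The sketch below assumes a differential power-law spectrum N(E) proportional to E^(-n) and idealized detectors with sharp energy windows and equal geometric factors; these simplifications are made here for illustration only, and the function names are hypothetical.

```python
def band_integral(e_lo, e_hi, n):
    """Integral of E**(-n) from e_lo to e_hi (n != 1), energies in MeV."""
    p = 1.0 - n
    return (e_hi ** p - e_lo ** p) / p


def ratio_index(n):
    """Ratio of the 50-130 MeV flux to the 40-110 MeV flux for exponent n."""
    return band_integral(50.0, 130.0, n) / band_integral(40.0, 110.0, n)


# A larger n means a softer spectrum; the ratio falls as n grows.
for n in (2.0, 3.0, 4.0):
    print(n, round(ratio_index(n), 3))
```

Under these assumptions a falling ratio corresponds to a larger exponent, that is, to a softer spectrum.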


12.5 Intensity vs B on L Shells 
In Fig. 50 [parts (a), (b), and (c)] measurements from various 
satellites are compared on the three L shells, 1.3, 1.5, and 1.8. The 


* A softer spectrum contains a larger fraction of low-energy particles. 
Fig. 50 — Values of omnidirectional flux on various specified L-shells, for the
satellites and energy ranges indicated in the legend, are plotted vs B, in parts 
(a), (b), and (c). Ratios of Telstar® 1 to Explorer 15 measurements are shown 
in part (d). References are given in Table VII. 
Explorer 15 and Injun 3 measurements have been compared in more
detail by Valerio? Observe that J decreases monotonically with B
on all the L shells and the shape of J vs B is very similar for all the
measurements on the same shell except for the lowest L shell where 
the dependence on the energy response of the detector is most im- 
portant. Information concerning the energy spectrum near 45 MeV 
is contained in the changes in the ratios of the measurements, and in 
these circumstances the changes are independent of the absolute 
calibrations of the instrument. 

To cast more light on the qualitative behavior of the energy
spectrum, the ratio of the 50-130 MeV proton flux measured with the
Telstar® 1 satellite to the 40-110 MeV proton flux measured with
Explorer 15 has been calculated as a function of B for fixed L. The
results are plotted in Fig. 50(d). All the ratios increase with
increasing B for L from 1.2 to 1.9 inclusive. The values of B in the
plot cover the range from the magnetic equator to a magnetic dipole
latitude (λ) of about 30°. The increase in the ratio indicates a
spectrum that hardens with increasing B in the neighborhood of 45 MeV.
At L = 1.8 Freden et al.35 find a spectrum that hardens with
increasing B for proton energies between 10 and 35 MeV, but softens
with increasing B for proton energies above about 55 MeV. Our results
suggest that this change in behavior cannot have occurred below 50 MeV.


12.6 The Intensity Near the Top of the Atmosphere


The position of the 8-protons/cm² sec flux contour from the Telstar® 1
satellite is plotted in B,L coordinates in Fig. 51(a), together with our
own extrapolation of the published Injun 3 data?® to a flux of about 10
protons/cm² sec,* and the 16-protons/cm² sec flux contour from Explorer
4. The purpose of this figure is to test whether or not the altitude
dependence of contours of constant counting rate at low altitudes is
consistent with other data. The qualitative agreement of the results
plotted in Fig. 51(a) is quite good, especially for L < 1.8, where the
atmosphere is controlling. A number of effects may contribute to the
divergence of the results for L > 1.8. Among them are: temporal effects
(this region of the belt is shown to be subject to temporal variations
in Section X); instrumental effects (the instruments are near their
threshold sensitivities in a region of magnetic space in which the
energy spectrum may be anomalous); and biases in the fitting procedure
(examination of residuals gives some indication of a slight bias in the
fitted function in this region).

* Valerio!® states that his fits (and therefore his Fig. 8) are not intended to
represent the data accurately at low altitude.

It is difficult to get direct insight into the altitude dependence
from a B,L plot, so the values of B have been transformed into
minimum altitude by using the procedure mentioned in Section XI. The
minimum altitudes are plotted against L in Fig. 51(b). It is
characteristic of all three bodies of data that the minimum in the
minimum altitude curve does not occur at minimum L.

Fig. 51 — Comparison of isoflux contours obtained from three satellites near
the top of the atmosphere. Part (a) B,L coordinates, part (b) minimum altitude
(near the South American magnetic anomaly). References are given in Table VII.

It is tempting to consider whether the lower altitude of the Explorer
4 points, coupled with the lower low-energy threshold and high flux
associated with the Explorer 4 measurements, might imply that the
exosphere was less dense when the Explorer 4 measurements were made.
However, the uncertainty in the position of the Telstar® contour (see
Section XI) is so large that this figure cannot be used to refute the
hypothesis that the atmosphere contracted** between 1958 (approximately
solar maximum), when the Explorer 4 measurements were made, and 1962,
when the Telstar® data were taken. This would remain true even if one
were prepared to overlook the possibility that the energy spectrum at
these low altitudes is anomalous*®* and consequently that the calculated
geometric factors of the instruments may be in substantial error near
the cutoff.


12.7 Equatorial Pitch Angle Distribution 


The solid curves in Fig. 52(a) represent the equatorial pitch angle 
distributions, at various values of L, calculated from (8) and the co- 
efficients in Table V. When these are compared with the equatorial 
pitch angle distributions obtained from the Injun 3 data,?® which have 
been replotted as the dashed curves in Fig. 52(a), they are found to 
be very similar in shape, although the Telstar® curves are a trifle 
flatter. This would be anticipated from the previous discussion of the 
tendency of the energy spectrum of protons with energies near 45 MeV 
to harden at high values of B. The shapes of the distributions are,
however, appreciably different from those derived by Lenchek and Singer*’
from consideration of possible injection and loss mechanisms. This
may be seen in Fig. 52(b) which contains the present results as the
solid lines, and the results of Lenchek and Singer®’ as the dashed lines.


12.8 Other Bodies of Data


A portion of the considerable body of relevant high-energy proton
data, some of which does not meet the requirements for inclusion in
Table VII, is noted here. The earliest measurements of proton
intensities were made on Explorers 1 and 3 by Van Allen.** His historic
estimate of approximately 2 × 10⁴ protons/cm² sec with energies >40 MeV
at the heart of the inner belt (x = 0, L = 1.56) has been substantiated
by all the measurements reported to date. In particular, the high-energy
proton measurements made in the inner belt by Explorers 6, 12, and 14
and Pioneers 3 and 4 have been noted by Frank et al*® to agree with each
other and with those on Explorers 1 and 3. Reference to some
measurements made with ballistic probes may be found in the article by
Freden et al.35

Fig. 52 — Unidirectional flux vs x on various L-shells. The solid curves are
derived from the HTB coefficients using (8). The dashed curves in part (a) are
Injun 3 results (from Valerio!® Fig. 8). The dashed curves in part (b) are the
results of the theoretical calculations of Lenchek and Singer,?7 taken from their
Fig. 10 and arbitrarily normalized to reasonable values of j at x = 0.


XIII. QUO VADIS 


The mathematical model which has evolved along the lines summa- 
rized in Section IV has provided a very satisfactory representation of 
the high-energy proton data from the Telstar® 1 satellite, as discussed 
in both statistical and physical terms in Sections VI through XII. It is 
appropriate to consider how this work might be extended. 


13.1 Further Improvements within the Present Scheme 

The final fit of Model II has a mean square error which is less than
twice the variance to be expected on the assumption of a Poisson
distribution of the count data. Some of this excess is surely due to
“experimental error.” However, one might seek some additional
improvement by the addition of more parameters to the fitting function as
indicated in Model III of Section 4.7. Such fits, carried out on an 
approximately 1000-point selected data set, will almost surely lead to 
a reduction in the mean square residuals because of the increased free- 
dom the additional parameters provide. However, as noted in Section 
4.7, preliminary work with Model III has not led to a really substan- 
tial improvement, either statistically or aesthetically as judged by plots 
of the residuals. 

Additionally, one might try to improve further on the
representativeness of the sample by simple iteration. Using the HTB fit
to Model II to determine new x,L cells, another sample might be selected
and fitted. The very small differences between the Model-I CB fit and
the Model-I (or II) HTB fit do not suggest that this would be fruitful
in the present case. If the preliminary fit used for determining the x
boundaries of the cells were a poorer fit, iteration would clearly be
worthwhile.

A further extension of the procedure for designating representative
cells would involve the development of a two-dimensional version of
the basic idea and procedure outlined in Section 7.1. Specifically, one
would try to define approximately 1000 x,L cells within each of which
the preliminary fit to y(x, L) has the same range. In the present case,
the anticipated gain from this refinement did not seem to justify the
practical difficulties. However, a practical, well-defined algorithm for
such a process in several dimensions simultaneously might prove very
useful.


13.2 Another Approach to the Model 


All the models presented so far are of the form

$$y(x, L) = A(L) \cdot b(x; e_i(L)), \tag{18}$$

where A(L) represents the variation in intensity along the magnetic
equator and $b(x; e_i(L))$ represents the variation with x on an L-shell.
The $\{e_i(L)\}$ adjust the nature of the dependence on x, as a function
of L. This approach arises from the L-shell orientation of the adiabatic
theory of trapped particle motion.

Alternatively, one might focus attention on the shape of y as a
function of L at constant x, rather than on y as a function of x at
constant L. It is shown in Fig. 19(a) and discussed in Section 7.2 that
y(x, L) as a function of L for fixed x forms a simple nesting set of
curves at successive values of x. This is a consequence of the monotonic
decrease of y with x at any fixed L. With this orientation, a model
might be expressed as:

$$y(x, L) = F(L; p_i(x)), \tag{19}$$

where F(L) is the shape of a constant-x section, whose parameters,
the $\{p_i\}$, are expressed as functions of x. Although this approach
would not contain the L-shell orientation of the particle motion
explicitly, it seems to offer very significant practical possibilities.


13.3 Full Data Utilization 


In the two-dimensional fits that were carried out, only a selected set 
of data were used, either chosen at random within a set of narrow, 
contiguous L-slices, as in the fit of Section VI, or chosen on the basis 
of a preliminary fit to the data as in Section VII. All the data were 
examined by residual plots and mean square residual measures of the 
fits, but only a small part of the data were actually used in determin- 
ing the values of the fitting parameters. With this procedure, informa- 
tion is clearly being lost that could be used to “better” determine the 
function. 

Several methods have been applied in the past to allow all of an 
existing body of satellite data to influence the mathematical descrip- 
tion of that data. The most direct method uses interpolation or smooth- 
ing functions. It is often the case that consecutive satellite observa- 
tions from a particular detector are closely enough spaced to determine 
the local spatial variation. Under these conditions a sequence of data
points can be averaged or fitted to a local smoothing function. A num- 
ber of points in a sequence may thus be replaced and represented by a 
single point which is determined by them all. The replacement may 
also be made at some particularly convenient coordinate location, for 
example, at one of a fixed set of L or x values on which functional 
fitting may subsequently be carried out. This method has been used 
on the data of Explorer 15, portions of which have been described by 
McIlwain,1* Roberts*® and Brown.*!

In the context of the high-energy proton data from the Telstar® 1
satellite, a different but analogous procedure could be used. Rather
than selecting at random one data point within each of approximately
1000 x,L cells, all points within a given cell could be used to determine
a value which would represent the observable at the central point of
the cell. This might be done by simply averaging the points within the
cell, but the cell size is large enough so that the x and L dependence
within the cell generally cannot be neglected. A more representative
procedure would be to fit the points within an x,L cell with a local
smoothing function. This function can be the same function with which
the finally selected data values would be fitted across the complete
range of x,L space (see Appendix B.7). Although in the present case
the average number of points per cell is about 40, in many cells the
number of points is fewer than the number of coefficients of the Model
II function, and some coefficient constraint would be required. This is
not a substantial objection, however, since the function is only being
used for smoothing and does not need to be capable of elaborate
variation over an x,L cell.

A procedure of this kind greatly reduces the chance that members 
of a final 1000-point set will be nonrepresentative and acknowledges 
the experimental weight of adjacent observations in fixing the values 
of the set. Accordingly, one would expect a reduction in the mean
square residuals over all the data, from a fit to such a smoothed sample.
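
The difference between simple averaging and local smoothing within a cell can be sketched as follows. This is only an illustration under stated assumptions: it uses a single coordinate and a least-squares straight line as the local smoothing function, whereas the text suggests a constrained form of the model function itself. The function name and the data are hypothetical.

```python
def smooth_cell(points, center):
    """Return a local straight-line fit evaluated at the cell center.

    points: list of (x, y) pairs falling in one cell (one coordinate
    only, for brevity); center: x value representing the cell.
    Falls back to the plain mean when the slope cannot be determined.
    """
    n = len(points)
    if n == 0:
        raise ValueError("empty cell")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y + slope * (center - mean_x)


# Hypothetical cell spanning x = 0.0..1.0 with points bunched at low x,
# where y falls linearly with x.
cell = [(0.0, 10.0), (0.1, 9.8), (0.2, 9.6), (0.9, 8.2)]
print(sum(y for _, y in cell) / len(cell))  # plain average, pulled toward low x
print(smooth_cell(cell, center=0.5))        # value referred to the cell center
```

When the points are unevenly spread over the cell, the plain average represents the observable at the mean position of the points rather than at the cell center; the local fit removes that offset.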

The procedure of smoothing within a cell could be used with larger 
x, L cells (with more points per cell) to define a point set smaller than 
1000. It can of course also be used with much larger bodies of data 
up to a maximum of 1000 points per cell with the existing computer 
program. 


13.4 Extension to Other Cases 
There are very evident values in being able to communicate the 
essence of a large body of data in terms of a mathematical model with 
a small number of coefficients. This is very effectively accomplished by
the present empirical representation of the Telstar® 1 high-energy 
protons, but the model is very specialized. As previously noted, includ- 
ing a wider range of space such as that explored by the Telstar® 2 
satellite requires modification of the function. Characterizing the pro- 
ton distribution for substantially lower energy protons may well re- 
quire functions outside the generality of even Model III. Treating 
electrons in almost any region of space requires treating time as well 
as position variables because a complete set of measurements of the 
spatial distribution of the particles cannot readily be obtained in a 
time short compared with significant time variations. 

No single formulation yet exists which is capable of coping in a use- 
ful way with the range of measurements of particles trapped in the 
magnetic field of the earth. However, the success of the present
formulation as it has been evolved and the general methods that have
been developed give us confidence that other and more complicated cases
can be treated.


XIV. SUMMARY AND CONCLUSIONS 
This section provides a summary, with references, for the entire 
document including the appendices. 


14.1 General Accomplishment 

The main accomplishment is the development of a relatively simple 
(empirical) mathematical model which gives a statistically accurate 
representation of the spatial distribution of high-energy protons meas- 
ured with the Telstar® 1 satellite. 


14.2 The Data 


14.2.1 Space and Time Coverage (Sections I and II)

The data were acquired between July 1962 and February 1963 within
the region of space bounded by 1.09 $R_e$ ≤ R ≤ 1.95 $R_e$ and 0 ≤ λ ≤ 58°.
Inside these boundaries good temporal and spatial coverage were
achieved.


14.2.2 Energy Range and Instrumental Sensitivity (Appendix A)

The nominal energy interval of the detector is 50 < E < 130 MeV
and its nominal geometric factor is 0.143*9-324 cm² ster. The
instrument is effectively omnidirectional and the lower threshold of
sensitivity is 1 proton/cm² sec.
14.2.3 Telemetry (Section II)


Each observation consisted of the number of counts registered in 11
seconds. With this was associated the time at which the telemetry was
received, and auxiliary information.


14.3 The Models 


14.3.1 Coordinate System (Section III)


Each model relates the omnidirectional intensity of high-energy pro- 
tons to a two-dimensional magnetic space whose coordinates, x,L, de- 
rive from a mapping of the earth’s main magnetic field onto an axially 
symmetric dipole field through the adiabatic invariants of the particle 
motion. 


14.3.2 General Form and Properties (Section IV) 


The models have the form of a product, A(L)·G(x,L), in which
the first term expresses the equatorial intensity as a function of L, and
the second term describes the diminishment of intensity, as a function
of increasing x, for fixed L. The functional expressions for G
(excluding G’’) transform in closed form to equivalent pitch angle
distributions.


14.3.3 Specializations (Sections IV and IX) 


Retrospectively, all the models may be considered to be specializa- 
tions of Model III, but historically the two-dimensional models evolved 
from a series of one-dimensional fits on L-slices. These fits led to the 
L-slice model which was then generalized empirically to the two-di- 
mensional Model I. Model I was in turn specialized to Model II to 
overcome some statistical (nonlinearities and high correlations) and 
interpretive difficulties encountered with Model I. 


14.4 Fitting


14.4.1 Criterion (Section III and Appendix B) 

The least squares criterion was used in deriving estimates of the 8 
(or 9 or 10) coefficients required by the models to fit the data. 

14.4.2 Scale (Section III and Appendix B)


To stabilize the variance of the observations, the models have been 
fitted to the square root of the observed counting rate. 
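
The variance-stabilizing property of the square root can be checked numerically. The sketch below assumes the counts in one observation are Poisson distributed and computes the variance of the square root of the count by direct summation; the function name is hypothetical, and the calculation is on the scale of counts rather than counting rate.

```python
import math


def sqrt_poisson_variance(mean):
    """Variance of sqrt(N) for N ~ Poisson(mean), by direct summation."""
    upper = int(mean + 20.0 * math.sqrt(mean) + 50)
    p = math.exp(-mean)          # P(N = 0)
    e_sqrt = 0.0                 # accumulates E[sqrt(N)]
    e_n = 0.0                    # accumulates E[N] = E[sqrt(N)**2]
    for k in range(1, upper + 1):
        p *= mean / k            # P(N = k) from P(N = k - 1)
        e_sqrt += math.sqrt(k) * p
        e_n += k * p
    return e_n - e_sqrt ** 2


for mean in (10.0, 50.0, 200.0):
    print(mean, round(sqrt_poisson_variance(mean), 4))
```

For mean counts of this size the variance stays close to 1/4 regardless of the mean, whereas the variance of the count itself equals the mean.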
14.4.3 Sampling (Sections 6.1, 6.9, 7.1 and Appendix B.3)

Coefficients of Models I and II were estimated by fitting samples 
containing about 1000 of the nearly 80,000 available observations. 
Sampling is necessary to avoid exaggerating the importance of regions 
of x,L space where data are abundant, and also for compatibility with
existing computer programs. A method of sample selection based on a 
preliminary fit has been developed to provide a good overall represen- 
tation of the data. Before selecting the sample, the data were parti- 
tioned to remove instrumental effects and outliers identified by study- 
ing residuals from preliminary fits. 
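
The idea of selecting one observation per cell, so that densely covered regions do not dominate the fit, can be sketched as follows. The cell definition used here is a plain grid chosen for illustration; in the paper the cells are derived from a preliminary fit. The function name, the record layout, and the data are hypothetical.

```python
import random


def sample_one_per_cell(observations, cell_of, seed=0):
    """Pick one observation at random from each occupied cell.

    observations: iterable of records; cell_of: function mapping a record
    to a hashable cell label. Both are placeholders for illustration.
    """
    rng = random.Random(seed)
    cells = {}
    for obs in observations:
        cells.setdefault(cell_of(obs), []).append(obs)
    # Sort the labels so the result is reproducible for a given seed.
    return [rng.choice(cells[label]) for label in sorted(cells)]


# Hypothetical records (x, L, counts): 50 crowd into one cell, 2 into another.
data = [(0.12 + 0.0005 * i, 1.30, 400 + i) for i in range(50)]
data += [(0.65, 1.80, 35), (0.67, 1.80, 33)]
sample = sample_one_per_cell(data, cell_of=lambda r: (int(r[0] * 10), r[1]))
print(len(sample))
```

Each occupied cell contributes exactly one point to the sample, whatever the number of observations it contains.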


14.5 Quality of Fit 


14.5.1 Criteria of Judgment (Sections VI to IX and Appendices B 
and C) 

Judgments regarding the quality of fit were largely based on graph- 
ical studies of residuals, the behavior of the fit at the boundaries of 
the radiation belt and various statistical measures. Residuals (equal 
to observed minus fitted), on the square root scale, were particularly 
useful as sensitive indicators of the quality and nature of the fit. 


14.5.2 Comparisons Among Models (Sections V and IX) 

The L-slice fits give good one-dimensional representations of very 
limited regions of data. Both the standard errors of the coefficients and 
the correlations among coefficients are high compared to the corre- 
sponding measures derived from the two-dimensional fits. The fits of 
Models I and II to the 960-point HTB sample are practically equiva- 
lent. However, Model II is superior in the following respects: one less 
coefficient is required, standard errors are uniformly smaller, correla- 
tions among the coefficients are uniformly smaller, the index of non- 
linearity is very much smaller, and more of its coefficients have a phys- 
ical meaning. 


14.5.3 Coordinates (Sections VI and VII) 
Plots of residuals vs x, L, time, etc. indicate the general adequacy 
of x,L coordinates for the organization of the data. 


14.5.4 Quantitative Measures (Sections VII, VIII, IX, and Appendices 
B and C) 

Typically, the fits account for nearly 99 percent of the variability 
about the data mean. The mean square error of fit is about 14 times 
as large as would be anticipated on the basis of assumed Poisson
statistics. Even in the worst of quite small spatial regions, the mean
square residual does not exceed 24 times the Poisson-based prediction. 
Probability plotting procedures indicate that the residuals are closely 
normally distributed and lead to an estimate of the variance which is 
about twice the Poisson-based prediction. 
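
The two kinds of measure quoted above can be computed as in the following sketch. It assumes the fit was made to the square root of the counts, for which the Poisson-based variance is approximately 1/4; for the square root of a counting rate the value would be scaled by the counting interval. The function name and the numbers in the example are hypothetical.

```python
def fit_summary(observed_sqrt, fitted_sqrt, n_coefficients, poisson_variance=0.25):
    """Summarize a fit made on the square-root-of-counts scale.

    Returns (fraction of variability about the mean accounted for,
    mean square residual divided by the Poisson-based variance).
    The 0.25 default is the large-count approximation for sqrt(counts).
    """
    n = len(observed_sqrt)
    mean_obs = sum(observed_sqrt) / n
    total_ss = sum((y - mean_obs) ** 2 for y in observed_sqrt)
    resid_ss = sum((y - f) ** 2 for y, f in zip(observed_sqrt, fitted_sqrt))
    mean_square = resid_ss / (n - n_coefficients)
    return 1.0 - resid_ss / total_ss, mean_square / poisson_variance


# Made-up values, for illustration only.
observed = [10.0, 12.5, 15.2, 17.4, 20.1, 22.6]
fitted = [10.3, 12.4, 14.9, 17.6, 19.8, 22.9]
print(fit_summary(observed, fitted, n_coefficients=2))
```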


14.5.5 General Limitations (Appendix C) 

Statistical examination of all the data, categorized in x,L cells
defined from a preliminary fit, indicates that it is unlikely that the fit
given by the present model could be significantly improved by any
simple modification based on x,L coordinates alone.


14.6 Numerical Values of Fitted Coefficients, Standard Errors, etc. 


14.6.1 L-Slices (Section V) 

Coefficient values and other statistics for four L-slices appear in 
Table II, and values of coefficients for a large number of L-slices are 
shown in Figs. 8 to 10. 


14.6.2 Models I and II (Sections VI to IX, also Sections V, XI, and 
XII) 

Model II is the preferred model. Coefficients, standard errors,
correlations, and other summary analysis-of-variance statistics appear in
Table IV for Model I and Table V for Model II. The coefficient
functions: (i) square root of average counting rate, y(x,L); (ii) square
root of average equatorial counting rate, A(L); and (iii) position of
cutoff, $x_c(L)$; are well-determined, and applicable values, standard
errors, and correlations appear in Figs. 19 and 24 for y(x,L) (and
Figs. 45 to 47 for the flux); Figs. 8, 11, 21, 22, and 30 for A(L); and
Figs. 9, 12, 22, 23, and 30 for $x_c(L)$ (and Fig. 44 for altitude).


14.7 Some Physical Results 


14.7.1 Flux Maps (Section XII) 


Flux maps are given in B,L and R,λ coordinates and as J,B contours
for constant L, based on the fitted model and using a calibration of the
detector assuming certain single-component energy spectra. Neglecting 
uncertainties of calibration, the relative fluxes have a standard error of 
about 2 percent. The value of the maximum flux is (5.7753) X 
10° protons/cm’ sec at L = 1.46 on the magnetic equator. 
14.7.2 The Cutoff (Section XI)

The minimum geographical altitude corresponding to the fitted cut- 
off function was computed. This altitude varies as a function of L and 
has a value of about 270 km at the magnetic equator at L = Ip = 1.13 
and a minimum of about 160 km at L = 1.6. The shape of this L
dependence suggests that the interaction between the protons and the
residual atmosphere is of major importance in determining the cutoff
for values of L less than 1.9. For larger L values, the loss mechanism
determining the cutoff is of different origin.


14.7.3 Temporal Effects (Section X) 

The general spatial distribution of high-energy protons is very stable 
in time over the period covered by the present data; however, using 
residuals as a sensitive indicator, we find two temporal effects that are 
distinguishable from instrumental effects. Firstly, there appears to be 
an increase in the flux in the 1.9 < L < 2.2 region during the 30-day 
period starting about day 280, 1962. This increase varied from about 
5 to 90 percent depending on both x and L. Secondly, there is an indica- 
tion of a qualitative increase in the altitude of the cutoff over the pe- 
riod of the observations. The present results indicate that any diurnal 
variability of the earth’s magnetic field would have an upper limit of 
0.003 Gauss at L = 1.5. 


14.7.4 Comparison with Other Experiments and Theory (Section XII) 

The absolute fluxes measured in this experiment agree well (within 
a factor of two) with other extensive experimental measurements, but 
the present values are in general slightly lower. Spatial distribution of 
the flux agrees very well with other measurements but differs appreci- 
ably from published theoretical calculations. 


14.8 Extensions (Sections XIII, IV, and Appendix B) 

The methods developed in this work have led to a very satisfactory
representation of the high-energy proton data from the Telstar® 1 
satellite. 

With the better methods of utilizing data and selecting samples
noted in this paper, and with more general functional forms (some
approaches to which have been indicated), it should be possible to
represent the radiation intensity for other more extensive and less
“well-behaved” bodies of data than the one treated here. Most aspects of
the statistical methods developed are generally applicable to problems
of modeling data mathematically.
APPENDIX A.


The Instrument 


Energetic electrons and protons were measured on the Telstar® 1 
satellite by a group of detectors in all of which the sensitive element 
was a phosphorus-diffused silicon diode specially developed for such
particle measurements.” The active volume of the device is the disk- 
shaped space-charge region of the diode under reverse bias. For the 
detector measuring protons with energies above 50 MeV, the reverse 
bias was approximately 100 volts, the space-charge region was approx- 
imately 2.8 mm in diameter and 0.39 mm thick, and the diode was 
shielded by about 12 mm of aluminum over a solid angle of 2π and
somewhat more than 12 mm of aluminum equivalent over the remain- 
ing hemisphere (see Fig. 53). 

The thickness of the space-charge region of the detector was meas- 
ured with protons from a cyclotron. A calculation of the path-length 
distribution for unscattered particles in the space-charge region and in 
the surrounding shielding materials has been made. These calculated 
results have been combined with range-energy information, and the 
properties of the associated electronic circuits, to give the geometric 
factor of the instrument, g(E), as a function of the energy, E, of
protons incident on the spacecraft. The geometric factor varies with the
reverse bias voltage and the temperature of the detector, both of which
affect the effective thickness of the active volume of the diode. Fig. 54
is a graph of g(E) vs E for a bias voltage of -97.5 volts and a
temperature of 20°C, the nominal operating conditions of the instrument.
Note that protons with energies below 50 MeV were not detected. 

The geometry of the detector and shield is only approximately
omnidirectional. However, the satellite was spin stabilized, the symmetry
axis of the detector was nearly perpendicular to the spin axis of the
satellite, and the telemetered counting rate was an average over at
least 15 revolutions of the satellite. This averaging process tends to
remove any directionality inherent in the detector geometry. A
sensitive analysis noted in Section 7.10 failed to show any directional
dependence in the data.

Fig. 53 — The instrument.
Fig. 54 — Dependence of geometric factor on energy of protons incident on the
shielding. 
For a differential energy spectrum $N(E)$, where $N(E)\,dE$ is the number
of protons with energies between $E$ and $E + dE$, the average geometric
factor, $\bar{g}(E_1, E_2)$, of the detector for particles with energies
between $E_1$ and $E_2$ is defined by

$$\bar{g}(E_1, E_2) = \frac{\int_0^{\infty} g(E)\,N(E)\,dE}{\int_{E_1}^{E_2} N(E)\,dE}. \tag{20}$$

The function $\bar{g}(50\ \mathrm{MeV}, E_2)$ has been evaluated numerically
for various values of $E_2$ and forms of $N(E)$. The values of
$\bar{g}(50\ \mathrm{MeV}, 130\ \mathrm{MeV})$ are plotted in Fig. 55 as a
function of n for the single-component power-law spectrum
$N(E) \propto E^{-n}$, and also as a function of $E_0$ for the
single-component exponential spectrum $N(E) \propto \exp(-E/E_0)$. It may be
seen from the figure that $\bar{g}(50\ \mathrm{MeV}, 130\ \mathrm{MeV})$ varies
by less than 6 percent from 0.143 cm² ster for 0 < n < 7.5 and
10 MeV < $E_0$ < 90 MeV. These ranges of n and $E_0$ include most
experimentally determined values by a comfortable margin.*’*'”’’***° The
omnidirectional flux, $J(E_1, E_2)$, of protons with energies between $E_1$
and $E_2$ is given by

$$J(E_1, E_2) = \frac{4\pi Y^2}{\bar{g}(E_1, E_2)}, \tag{21}$$

where $Y^2$ is the counting rate of the detector. In the body of this paper,
the values $E_1$ = 50 MeV, $E_2$ = 130 MeV and

$$g = \bar{g}(50\ \mathrm{MeV}, 130\ \mathrm{MeV}) = 0.143\ \mathrm{cm^2\ ster} \tag{22}$$

are used. The flux J(50 MeV, 130 MeV) is designated simply by J,
and the counting rate to flux conversion is considered to be independent
of the proton energy spectrum.

Fig. 55 — Dependence of average geometric factor on the exponent of a
differential power-law energy spectrum and the e-folding energy of an exponential
energy spectrum.

While the relative value of g shows a variation of less than 6 percent 
for the wide range of single-component energy spectra noted above, 
the absolute value of g is less well specified. Variations in the ambient 
temperature and reverse bias voltage may change the effective
geometric factor by as much as 25 percent. The difficulty of dealing with
the complexities in shielding geometry, caused by embedding the 
instrument in the spacecraft, introduces additional uncertainties in 
the absolute value of g. These uncertainties are in the range of —25 
to +50 percent. 

No provision was made for recalibrating the detector once the 
satellite was in orbit. However, the evidence, which is discussed in 
Section X, concerning the temporal variations of the proton distribution 
is that neither the detector nor the associated circuit elements were 
substantially affected by the space environment. Instrumental (e.g., 
temperature and bias voltage) effects are often quite different in char- 
acter from temporal changes in the proton belts and may be separated 
from them in many circumstances. It is, of course, possible to postulate 
instrumental effects that will be inextricably confounded with certain 
secular changes that might take place in the proton distribution. 


APPENDIX B. 
Some Statistical Details 
B.1 Introduction 


This appendix presents, heuristically, some facts and formulae con- 
cerning the statistical analysis of the data. While a variety of statis- 
tical principles, precepts and procedures were employed as guides, the 
main judgments came from empiricism, scientific intuition and com- 
mon sense. Various kinds of plots of residuals, used informally, have 
been of key importance, both for evaluation and for suggestion. 

Simply stated, the objective was to produce a statistically accurate 
analytical description of the intensity distribution of high-energy protons
in space surrounding the earth. The process of analysis involved
the empirical evolution of a mathematical model, in interaction with 
the application of fitting and evaluative techniques. The data source 
and processing have been described in Sections II and III. The itera- 
tive and interactive processes of the final stages of model development, 
fitting, data partitioning and data sampling are described in Sections 
IV to IX. 
Appendix B.2 deals with the basis for use of the square root
transformation, Y, of the counting rate data, $Y^2$. Appendix B.3 discusses the
selection of a sub-sample used in fitting. The use of the method of least 
squares in nonlinear model fitting, to estimate unknown coefficients, 
or functions of the coefficients, and their standard errors and correla- 
tions is reviewed briefly in Appendix B.4. Some remarks on construc- 
tion of sums of squares contours, often referred to as confidence re- 
gions, and of indices of local nonlinearity of the model are given in 
Appendix B.5. Appendix B.6 discusses several issues relevant to the 
interpretation of the analysis of variance results. Appendix B.7 de- 
scribes a mode of “smoothing” data within cells, which could have 
been used in conjunction with the sub-sampling procedure. Appendix 
B.8 concerns the technique of probability plotting. 


B.2 The Square Root Transformation 


It appears a reasonable assumption (supported by some empirical 
evidence) that, in the absence of geophysical disturbances, at a fixed 
point in space relative to the earth, the number of counts Z, recorded 
in the detector in 11 seconds, will vary in time according to a Poisson 
distribution, i.e., 

$$\text{Probability}\{Z = z\} = \frac{e^{-\nu}\,\nu^{z}}{z!}, \qquad z = 0, 1, 2, 3, \ldots, \tag{23}$$

where the parameter of the distribution, $\nu$, is the mean value of Z.

With this statistical model, the average intensity of radiation in the
region of space measured by the detector is proportional to $\nu$, where
the proportionality factor depends on the counter geometry and
efficiency. The objective is to develop a function which describes how $\nu$
varies in space, based on observations of the quantity Z at different
positions in the satellite orbit.

For the Poisson distribution, the variance of Z is also $\nu$, i.e., the
average of the squared deviations, $(Z - \nu)^2$, is $\nu$. Thus, as the value of
$\nu$ changes, the variance of the associated random variable Z also
changes. Hence, the scatter of Z about its average value will be
different in different regions of space as the average intensity fluctuates.

Working with the experimental data on the scale of Z has two
drawbacks. Firstly, if one fitted a mathematical model to the data using a
least squares criterion, the different observations would have variable
weight, which would require appropriate, and troublesome, allowance in
the fitting procedure. Secondly, graphical judgment of the adequacy of
any particular fit would be difficult because of the variable scatter of
the data about a fitted function in different regions.

Thus, the square root transformation, Y, of the counting rate
($Y^2 = Z/11$ counts per sec) was used to “stabilize” the variance and
the model-fitting procedure employed unweighted nonlinear least
squares on Y (but with some data weighting as discussed in Section
7.1 and Appendix B.3).

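Heuristically, consider the linear Taylor's expansion of $\sqrt{Z}$ about
$Z = \nu$:

$$\sqrt{Z} = \sqrt{\nu} + \frac{Z - \nu}{2\sqrt{\nu}} + \cdots. \tag{24}$$

Then, the variance of $\sqrt{Z}$ is approximately

$$\operatorname{Var}(\sqrt{Z}) = \left(\frac{1}{2\sqrt{\nu}}\right)^{2} \operatorname{Var}(Z - \nu) + \cdots. \tag{25}$$

If

$$\operatorname{Var}(Z - \nu) \approx \nu, \tag{26}$$

then

$$\operatorname{Var}(\sqrt{Z}) \approx \frac{\nu}{4\nu} = \frac{1}{4}; \tag{27}$$

that is, $\operatorname{Var}(\sqrt{Z})$ would be approximately a constant.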

Discussions of this transformation are given by Bartlett and
Anscombe. If the distribution is in fact Poisson, then Anscombe shows
that the average value of $\sqrt{Z}$ is approximately

$$\sqrt{\nu} - \frac{1}{8\sqrt{\nu}} - \frac{7}{128\,\nu^{3/2}},$$

while the variance of $\sqrt{Z}$ is, asymptotically,

$$\frac{1}{4}\left(1 + \frac{3}{8\nu} + \cdots\right).$$

Again for the Poisson distribution, Bartlett gives exact values of the
dependence of the variance of $\sqrt{Z}$ on $\nu$, summarized in the following:

    $\nu$              :  0    0.5    1      2      3      4      6      9      15
    Var $\sqrt{Z}$     :  0    0.310  0.402  0.390  0.340  0.306  0.276  0.263  0.256
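
The behavior summarized in this table can be checked numerically. The
following is a minimal sketch, not part of the original analysis, that
sums the Poisson probabilities directly to obtain the variance of the
square root of a Poisson variable. The function name and the optional
`shift` argument (for trying variants of the form sqrt(Z + c)) are
illustrative assumptions.

```python
import math


def poisson_sqrt_variance(nu, shift=0.0):
    """Variance of sqrt(Z + shift) for Z ~ Poisson(nu), by direct summation."""
    if nu < 0:
        raise ValueError("nu must be non-negative")
    if nu == 0:
        return 0.0
    # Sum far enough into the upper tail that the neglected mass is negligible.
    k_max = int(nu + 20.0 * math.sqrt(nu) + 50)
    prob = math.exp(-nu)          # P(Z = 0)
    mean_root = 0.0               # accumulates E[sqrt(Z + shift)]
    mean_arg = 0.0                # accumulates E[Z + shift]
    for k in range(k_max + 1):
        if k > 0:
            prob *= nu / k        # recurrence P(Z = k) = P(Z = k-1) * nu / k
        mean_root += prob * math.sqrt(k + shift)
        mean_arg += prob * (k + shift)
    return mean_arg - mean_root ** 2


if __name__ == "__main__":
    for nu in (0.5, 1, 2, 3, 4, 6, 9, 15):
        print(nu, round(poisson_sqrt_variance(nu), 3))
```

Run as a script, it prints one variance per mean value, which can be
compared with the tabulated row; for large means the result approaches
the limiting value 1/4.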


For a Poisson distribution, a transformation of the form $\sqrt{Z + 1/2}$
or $\sqrt{Z + 3/8}$ or $(\sqrt{Z} + \sqrt{Z + 1} - 1)$ will improve the variance
stabilization at smaller $\nu$ values. In the present application, such a
modification would have appeared physically artificial and
inconvenient. Moreover, the actual variance of the observations exceeds
the Poisson variance (see Appendix C) and the “correction” was thus
felt to be unwarranted. Some response to the (empirically defined)
variance instability remaining after the square root transformation
was made in the form of some weighting in the data selection (described
in Section 7.1 and Appendix B.3).

Of course, if one wished to adopt the assumption of a Poisson
distribution as an absolute basis for procedure, instead of as a guide, then
one might choose to use maximum likelihood to estimate the
coefficients of the model. This would mean developing a procedure for
determining values of the coefficients [of the function $\nu(x, L)$] which
would maximize

$$\prod_{\text{observations}} \frac{e^{-\nu}\,\nu^{Z}}{Z!}.$$


In the present case, a general program for nonlinear least squares was 
available while a procedure for Poisson likelihood maximization would 
need to be evolved. Apart from this practical consideration, however, 
it seemed more robust to use the Poisson assumption as a guide to 
developing an appropriate transformation preliminary to fitting by 
least squares. The point is that the square root transformation will 
effect an approximate variance stabilization not only when the variance 
is equal to the mean (as in the case of the Poisson distribution) but 
also when, more generally, the variance is proportional to the mean. 
Empirical vindication of this caution is given in Appendix C. More- 
over, the least squares approach enables the approximate statistical 
interpretation of results using familiar procedures from linear multiple 
regression methods. 
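
To make the contrast between the two criteria concrete, the following
sketch evaluates both the Poisson log-likelihood and the unweighted sum
of squares on the square root scale for a given set of model means. It is
an illustration only: the function names, the made-up counts and the
one-coefficient model are assumptions, and the sum of squares is formed on
sqrt(Z) rather than on Y = sqrt(Z/11), which differs only by a constant
factor.

```python
import math


def poisson_log_likelihood(counts, means):
    """Sum of log(e^-nu * nu^Z / Z!) over the observations (means must be > 0)."""
    total = 0.0
    for z, nu in zip(counts, means):
        total += -nu + z * math.log(nu) - math.lgamma(z + 1)
    return total


def sqrt_scale_sum_of_squares(counts, means):
    """Unweighted sum of squares of sqrt(Z) - sqrt(nu)."""
    return sum((math.sqrt(z) - math.sqrt(nu)) ** 2
               for z, nu in zip(counts, means))


if __name__ == "__main__":
    counts = [3, 7, 12, 20]           # made-up counts, for illustration only
    base = [1.0, 2.0, 4.0, 6.0]       # hypothetical model: nu_i = c * base_i
    for c in (2.0, 3.0, 4.0):
        means = [c * b for b in base]
        print(c, poisson_log_likelihood(counts, means),
              sqrt_scale_sum_of_squares(counts, means))
```

The first criterion is to be maximized and the second minimized. Only the
second remains a sensible criterion, without change, when the variance is
merely proportional to the mean.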

The present analysis is based on the quantity Y, where $Y^2$ = counting
rate = Z/11 counts per sec. Thus, if in fact Z were a Poisson variable,

$$\operatorname{Var}(Y) = \left(\frac{1}{11}\right)\left(\frac{1}{4}\right) = 0.023, \tag{28}$$

as a reasonable approximation. When the average counting rate
exceeds 1/11, this value of 0.023 is a lower bound on the variance of Y,
even with the Poisson assumption. Moreover, there are many other
possible sources of intrinsic variability and experimental error in this
situation.
A further benefit which one might expect from the square root
transformation in this circumstance is that the distribution of residuals
would tend to be more symmetric and more nearly normal (Gaussian). 

Some empirical properties of this square root transformation in the 
present body of data are given in Appendix C. 


B.3 Sample Selection 


As a practical requirement, the available multivariable, multicoeffi- 
cient, nonlinear least squares fitting program could operate with a 
maximum of 1000 data points. Hence, the 41,135 HTB observations 
needed to be sampled or condensed at a 1 in 40 ratio. 

As in all real sampling or experimental design situations, many com- 
peting criteria and practical difficulties were relevant. Perhaps the 
overriding point, explicitly understood here (and probably true in most 
actual model fitting problems), is that the model which was being de- 
veloped was not the “truth” but was really just a smoothing function 
which one wanted to fit well over a wide region of space. Thus, it was 
not appropriate to think of estimating the model coefficients, say, so 
as to optimize their apparent (indicated) statistical reliability, nor 
would it be appropriate to try to use all the available data in an 
equally weighted manner, since accidents of orbital position and in- 
strumental behavior would have too great an effect on the distribution 
of data points. 

The procedure developed for the present use is outlined in Section 
7.1, with pertinent remarks also in Section 13.3 and Appendix B.7. 

The method of Section 7.1 yielded 960 observations to which the
model was then fitted using unweighted least squares. The 960 sampled
observations were selected so as to be, roughly speaking, “widely
spaced,” the metric being change in average counting rate. Thus, the
challenge of fitting the 960-point sample, as measured by sum of
squares of residuals, is greater, on a per-observation basis, than would
be that of fitting the entire body of 41,135 HTB observations, very
many of which are quite close together. The “model bias” difficulties of
the entire body of data are concentrated in the sample. The statistical
fluctuation would be approximately the same, on a per-observation
basis, in the sample as in the whole body of data.


B.4 Estimation Procedure 


The unspecified coefficients of the models defined in Section IV were
estimated so as to minimize the sum of squares of deviations between
the observed Y and fitted y, for the sample array of data. The
iterative, multivariable, multicoefficient, nonlinear least squares fitting
was executed using a computer program due to Huyett and Wilk,
based on a procedure outlined by Wilk (see also Lundberg, Wilk and
Huyett).

The classical statistical properties of least squares estimation, 
namely unbiased estimates with minimum variance, apply in the case 
of statistically uncorrelated observations having equal variances and 
with the coefficients to be estimated occurring linearly in the model 
(see, for example, Wilks). In the present case, even with the square
root transformation, the observations do not have equal variances but, 
for practical purposes, the weighting implied by the selection proced- 
ure (see Section 7.1) compensates adequately. The model is, however, 
quite nonlinear in the coefficients. Still, one hopes that the attractive 
statistical properties of linear least squares carry over approximately 
to the nonlinear case because, in small enough neighborhoods, non- 
linear functions can be linearly approximated. (An index for measur- 
ing model nonlinearity is described in Appendix B.5.) In any case, 
the least squares criterion is geometrically appealing and primitively 
meaningful. 

Among the by-products of the fitting procedure, applied to the
particular array of data in x,L space, are approximate values for the
standard errors of the estimated coefficients, a matrix of approximate
pairwise correlation coefficients for the estimated coefficients, an
analysis-of-variance table giving the sum of squares accounted for and
not accounted for by the fitted model, a list of residuals (equal to
observed minus fitted), and various plots.

The least squares estimates of single-valued functions of the coeffi- 
cients, such as A(L), x.(Z), or y(x, L) are simply the same functions 
of the estimates of the coefficients (since least squares is an invariant 
process). Approximate variances and correlations of functions of the 
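coefficients may be derived as follows: If $\theta' = (\theta_1, \ldots, \theta_p)$ denotes
the coefficients of the model, and $\hat\theta$ their estimates, then the approximate
covariance of the estimates $g(\hat\theta)$ and $h(\hat\theta)$ of the functions $g(\theta)$ and
$h(\theta)$ is

$$\begin{aligned}
\operatorname{Cov}(g(\hat\theta), h(\hat\theta))
 &= \text{Statistical average of } \{(g(\hat\theta) - g(\theta))(h(\hat\theta) - h(\theta))\} \\
 &\approx \sum_{i} \frac{\partial g}{\partial \theta_i}\frac{\partial h}{\partial \theta_i}\operatorname{Var}(\hat\theta_i)
 + \sum_{i<j} \left(\frac{\partial g}{\partial \theta_i}\frac{\partial h}{\partial \theta_j}
 + \frac{\partial g}{\partial \theta_j}\frac{\partial h}{\partial \theta_i}\right)
 \rho_{ij}\sqrt{\operatorname{Var}(\hat\theta_i)\operatorname{Var}(\hat\theta_j)},
\end{aligned} \tag{29}$$

where the partial derivatives are evaluated at $\hat\theta$ and $\rho_{ij}$ is the
correlation of $\hat\theta_i$ and $\hat\theta_j$. The formula for the approximate
variance of $g(\hat\theta)$ is then just a specialization of the above, putting
g = h.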
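
The first-order (propagation-of-error) calculation described above can be
sketched as follows. This is an illustration under stated assumptions, not
the program used for the paper: the function names are hypothetical, and
the inputs are taken to be the gradients of g and h at the estimates, the
approximate standard errors of the coefficients, and their correlation
matrix.

```python
import math


def approx_covariance(grad_g, grad_h, std_errors, correlations):
    """First-order covariance of g(theta_hat) and h(theta_hat).

    grad_g, grad_h : partial derivatives of g and h at the estimates
    std_errors     : approximate standard errors of the estimated coefficients
    correlations   : matrix rho[i][j] of pairwise correlations (rho[i][i] = 1)
    """
    p = len(std_errors)
    total = 0.0
    # The full double sum over (i, j) covers the variance terms (i = j)
    # and both orderings of every correlated pair (i != j).
    for i in range(p):
        for j in range(p):
            total += (grad_g[i] * grad_h[j]
                      * correlations[i][j] * std_errors[i] * std_errors[j])
    return total


def approx_std_error(grad_g, std_errors, correlations):
    """Approximate standard error of g(theta_hat): the case g = h."""
    return math.sqrt(approx_covariance(grad_g, grad_g, std_errors, correlations))


if __name__ == "__main__":
    # Hypothetical two-coefficient example with g(theta) = theta_1 + theta_2.
    rho = [[1.0, 0.5], [0.5, 1.0]]
    print(approx_std_error([1.0, 1.0], [1.0, 2.0], rho))
```

Writing the sum over all ordered pairs keeps the code short; it is
algebraically the same as a sum of variance terms plus a sum over pairs
with i < j.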

Some associated facts and issues are worth mentioning here. First,
the approximate statistical correlations $\rho_{ij}$ of the estimated
coefficients of the model, or of functions of these, depend on (i) the
distribution of the sample in x,L space, (ii) the values of the coefficients and
(iii) the nature of the mathematical model; but do not depend on the
actual adequacy or appropriateness of the fit. Similarly, the approximate
standard errors of estimates are each made up as a product of which
one term depends upon the square root of the mean square of the
residuals of fit and the other depends only on the same factors as do
the $\rho_{ij}$. Second, the various statistical measures, such as standard
errors of estimated coefficients, which are obtained from the fit to the
960-point HTB sample are, in a narrow statistical sense, conservative
because they refer to the sample only and do not make allowance for
the fact that the fitted model does indeed fit very well to the entire
body of 41,135 HTB data. Thus, if statistical fluctuations were the
only factor in the uncertainty of the estimates, one might further
reduce this uncertainty by some factor, roughly approximated by
$6 \approx \sqrt{41{,}135/960}$. This view of statistical uncertainty does not,
however, give appropriate weight to the “model bias”, which will not
be eliminated by any number of observations. Third, all the summary
statistical measures, which are referred to as standard errors,
correlations, confidence regions, etc., should be used and interpreted in a
data analytic way, i.e., as indicating facets of the body of data and the
adequacy of its description by the model and analysis, rather than in
terms of some supposedly “true” model or hypothesis which one is
trying to evaluate in probabilistic terms.


B.5 Sums of Squares Contours, “Confidence Regions” and 
Nonlinearity Indices 


The models of Section IV are defined up to the values of the
unspecified coefficients. Any set of values for these coefficients may be
said to provide a “fit” to the 960-point sample of data. Thence one can
define a sum of squares function of the set of coefficients as

$$SS(\text{coefficients}) = \sum (\text{observed} - \text{“fitted”})^{2}, \tag{30}$$


which will take on various (positive) values as one varies the values 
of the coefficients. In the space of the coefficients there exist then, in 
principle, contours of this “sum of squares” function. 

While standard errors provide information on reasonable allowances 
for the estimate of a single parameter in the light of the fit of the 
model to the actual body of data, they do not carry any information 
on the joint statistical properties of the estimates. A reasonable (ro- 
bust and primitive) indication of joint statistical behavior is provided 
by these ‘sum of squares” contours in coefficient space. 

In the case of models in which the unknown coefficients occur lin- 
early, these contours are a family of ellipsoids defined by certain sim- 
ple quadratic functions of the coefficients. The orientation and shape 
of this family of ellipsoids indicate the interdependence of the esti- 
mates of the coefficients in the light of the data, and show which 
coefficients are well-determined and which poorly. However, the in- 
terpretive value depends heavily on geometrical appreciation and, for 
more than a few coefficients, high-dimensional representation cannot 
be achieved directly. 

The ellipsoid (even in the linear case) is not defined, in general, by 
its one-dimensional projections. (The standard error of a coefficient 
estimate is half the length of the projection of the unit ellipsoid of 
the family onto the coefficient axis.) But, as a matter of simple geo- 
metrical fact, all pairs of two-dimensional projections do uniquely de- 
fine the ellipsoid. Thus, one practical means of a complete graphical 
representation of the high-dimensional ellipsoid is in terms of all possi- 
ble pairwise planar projections. 

1436 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967

For the case of linear models, on the basis of a series of assumptions
(namely, that the differences between the model and the observations
are due to statistical fluctuations which are normally and
independently distributed, all with zero mean and the same variance), some may
choose the abstract probabilistic interpretation of these ellipsoids as
“confidence regions” (see, for example, Wilks). If this interpretation
is used, it is necessary that the distinctions and relationships between
the joint, pairwise and marginal confidence coefficients and regions or
intervals be understood. Details will not be provided here. Briefly, if
a nine-dimensional ellipsoid were specified to have a confidence
coefficient of $B_9$, then any two-dimensional projection would have a
confidence coefficient of $B_2$, interpreted marginally. The relation between
$B_9$ and $B_2$ is indicated by the following:

    $B_9$     $B_2$
    0.13      0.90
    0.25      0.95
    0.50      0.984
    0.75      0.997
    0.90      0.9994
    0.95      0.99995
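
A relation of this kind can be computed under one common set of
assumptions, which are adopted here for illustration and are not stated in
the text: normal theory with the error variance treated as known. The
nine-dimensional ellipsoid then corresponds to a chi-square cutoff c with
9 degrees of freedom, and its two-dimensional projection has marginal
coefficient 1 - exp(-c/2), the chi-square probability for 2 degrees of
freedom. The function names below are hypothetical.

```python
import math


def chi2_cdf(x, dof):
    """P(chi-square with dof degrees of freedom <= x), by a series expansion."""
    if x <= 0:
        return 0.0
    a, t = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    # Series for the lower incomplete gamma function, summed to convergence.
    while term > 1e-16 * total:
        term *= t / (a + n)
        total += term
        n += 1
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))


def joint_from_pairwise(b2, dof=9):
    """Joint coefficient of the dof-dimensional ellipsoid whose 2-D projection has b2."""
    cutoff = -2.0 * math.log(1.0 - b2)     # chi-square(2) quantile for b2
    return chi2_cdf(cutoff, dof)


if __name__ == "__main__":
    for b2 in (0.90, 0.95, 0.984, 0.997):
        print(b2, round(joint_from_pairwise(b2), 2))
```

For example, a pairwise coefficient of 0.90 corresponds to a joint
coefficient of about 0.13 under this calculation, in line with the first
row of the table.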


In the present case, the model is nonlinear and the fluctuations are 
not normal. Contours of the sums of squares function as a function of 
the coefficients can, in principle, be obtained for a given body of data 
and will not be ellipsoids. In practice, however, obtaining these con- 
tours is so laborious as to be virtually impossible. 

However, one may consider a linear (planar) approximation to the 
nonlinear model in the neighborhood of the least squares estimates of 
the coefficients and thence obtain expressions for a family of ellipsoids 
which may be reasonably good approximations to contours of the sums 
of squares function. An index of the effective nonlinearity of the 
model is the nonconstancy of the sums of squares of residuals on these 
ellipsoids and this can be normalized by division by the value of the 
minimum sum of squares. Such measures are presented and discussed 
in Sections VIII and IX. 

Given that the linear approximation is adequate, the nonnormality 
of the observations should not deter those who seek (and who believe 
in) the general probabilistic confidence interpretation since the statis- 
tical process is likely very robust. 

Sections VIII and IX contain specific examples of some of the pair- 
wise projections of these “approximations to sum of squares contours.” 
Specifically, the size of the 9-dimensional ellipsoid was such that, if
all the statistical assumptions applied, a joint 0.99 confidence
coefficient could be attached. Since a complete set of pairwise projections
for nine coefficients involves 36 ellipses, only a few are shown. As a
summary indicator of the nature and behavior of these ellipses the
quantity

$$a = (\text{sign of } \rho)\cdot\left(1 - \sqrt{1 - \rho^{2}}\right) \tag{31}$$

is tabulated (in Tables IV and V), where $\rho$ is the correlation of the pair
of coefficients involved. The value of $1 - |a|$ is the ratio of the area of
the actual ellipse to that of the largest ellipse which could be inscribed
in the rectangle formed by the horizontal and vertical tangents to the
actual ellipse (see Wilk). The range of a is $-1 \le a \le 1$, and large
values of $|a|$ (say above 0.75) correspond to narrow ellipses with major
axis oblique to the coordinate axes, and represent situations of high
interdependence of the coefficient estimates.
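
The index in (31) is simple to evaluate. The sketch below (function names
hypothetical) computes a from a correlation and the corresponding area
ratio 1 - |a|.

```python
import math


def ellipse_index(rho):
    """Return a = sign(rho) * (1 - sqrt(1 - rho^2)), as in (31)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return math.copysign(1.0 - math.sqrt(1.0 - rho * rho), rho)


def area_ratio(rho):
    """Ratio 1 - |a| of the ellipse area to that of the largest inscribed ellipse."""
    return 1.0 - abs(ellipse_index(rho))


if __name__ == "__main__":
    for rho in (-0.99, -0.6, 0.0, 0.6, 0.97):
        print(rho, ellipse_index(rho), area_ratio(rho))
```

Since |a| = 0.75 requires sqrt(1 - rho^2) = 0.25, the threshold of 0.75
corresponds to a correlation of about 0.97 in absolute value.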


B.6 The Analysis of Variance 


The analysis of variance provides a summary description of the 
apportionment of the “variability” of a body of data in the light of 
the model employed for analysis, where variability is defined in terms 
of sum of squares. 

Given n observations, one may visualize an n-dimensional observation
space, whose coordinates represent the possible values of each of the 
n observations. The data are then represented by a fixed point in this 
space. 

The model, having p unspecified coefficients, implies certain functional 
relationships amongst the coordinates of the observation space. Thus 
the model effectively defines a constraining ‘surface’ of p dimensions, 
and each point on this surface corresponds to some set of values of 
the unspecified coefficients of the model. The least squares estimate 
of the coefficients corresponds to that point on the constraining surface 
which is closest to the actual data point. If the coefficients in the model 
occur linearly then the constraining surface is a hyperplane which 
ordinarily, by definition of the observations, contains the origin, and, 
if the model includes a constant term, also contains the equiangular 
line (corresponding to the mean). 

The squared distance of the data point to the origin is then the total
sum of squares, $\sum Y^2$, while its shortest squared distance to the
constraining surface is the error or residual sum of squares, associated
with lack of fit. The difference between these may be termed the model
sum of squares and, for linear models, this is actually the squared
distance from the least squares estimates point to the origin.* If a
constant term is included in a linear model, then the model sum of
squares may be further decomposed additively in terms of the squared
perpendicular distance (call it $D_1^2$) of the least squares estimate point
to the equiangular line and the squared distance (call it $D_2^2$) along the
equiangular line, from the foot of that perpendicular, to the origin.
This latter quantity $D_2^2$ is usually termed the sum of squares due to
the mean. The squared distance of the above-defined point on the
equiangular line to the data point is called the corrected total sum
of squares, $\sum (Y_i - \bar{Y})^2$, and is just $\sum Y^2 - D_2^2$. The ratio of the
squared length $D_1^2$ to the corrected total sum of squares is defined as
the squared multiple correlation, $R^2$, and often used as a measure
of accomplishment of a model. It is easy to show that $R^2$ defined above
is equal to

$$1 - \frac{\text{Sum of squares for error}}{\text{total corrected sum of squares}}.$$

This latter quantity is computable even when the model is nonlinear
and/or does not contain a constant term.

*In the linear case, the model sum of squares is easily computed directly as
the squared length of the projection onto the hyperplane of the line joining the
data point and the origin. This fact is used in the present iterative computer
program in checking convergence.
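
The ratio form of the squared multiple correlation can be computed
directly from observed and fitted values, whatever the form of the model.
A minimal sketch (the function name is hypothetical):

```python
def squared_multiple_correlation(observed, fitted):
    """R^2 = 1 - (error sum of squares) / (corrected total sum of squares)."""
    n = len(observed)
    if n == 0 or n != len(fitted):
        raise ValueError("observed and fitted must be non-empty and equal in length")
    mean = sum(observed) / n
    error_ss = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    corrected_total_ss = sum((y - mean) ** 2 for y in observed)
    if corrected_total_ss == 0:
        raise ValueError("corrected total sum of squares is zero")
    return 1.0 - error_ss / corrected_total_ss


if __name__ == "__main__":
    print(squared_multiple_correlation([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

A fit that reproduces the data exactly gives 1, and a fit no better than
the overall mean gives 0; for a nonlinear model without a constant term
the value can fall below 0.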

One may define contours of sums of squares of residuals in the con- 
straining surface as the loci of the intersections with the surface of 
given radii from the observation point. In the event that the constrain- 
ing surface was a hyperplane, which would be true if the unspecified 
coefficients in the model occur linearly, then these loci (or contours) 
would be a family of p-dimensional spheres. For nonlinear models, 
this will be approximately true for a sufficiently small neighborhood 
of the least squares point. 

The particular form of the model, in regard to the unspecified co- 
efficients, defines a coordinate system within the constraining surface. 
Three cases are worth distinguishing. First, the constraining surface 
is a hyperplane and the coefficients are linear. Second, the surface is 
a hyperplane but its coordinates are nonlinear. The second case may 
be reduced to the first by appropriately transforming the coefficient 
coordinate system. Third, the surface is nonlinear. In this case one 
can approximate the surface by a hyperplane in a small neighborhood. 
Thus, in a sufficiently small neighborhood, the situation can be re- 
garded as linear. 

The approximately or exactly linear coordinates implied by the 
model will in general be nonorthogonal. Thus, the representation of 
the spherical (exact or approximate) contours in an orthogonal co- 
ordinate system for the coefficients yields a family of ellipsoids. In the 
sense of measuring lack of fit by sums of squares between fitted and 
observed values, these contours in coefficient space constitute sets 
whose members are “equidistant” from the data point. 
B.7 A Procedure for Smoothing in Cells


In Sections 7.1, 13.3 and Appendix B.3, discussion of why and how
to sample and possibly “smooth” the data has been given. One specific 
practical possibility is now described. 

Suppose one has a preliminary fit of the model, represented by
$g(w_i; \hat\theta)$, where $\hat\theta' = (\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_p)$ are the fitted coefficient values
and $w_i$ denotes the independent variables. Suppose this preliminary
fit is used to partition the space of the independent variables (here
x and L) into some approximation of equirange cells, as described
earlier. As argued in Section XIII, it may be profitable to “smooth”
the data in each cell so as to yield a value generally representative
of all the observations in that cell, instead of using a random selection
from the cell.

A sensible smoothing function for each cell is, clearly, the model
$g(w; \theta)$. A simple procedure is, for each cell separately, to carry out
one stage of linear adjustment, doing the linear least squares regression
of $\{Y_i - g(w_i; \hat\theta)\}$ on $\partial g/\partial\theta_1|_{\hat\theta}, \ldots, \partial g/\partial\theta_p|_{\hat\theta}$, to obtain the regression
coefficients $\delta' = (\delta_1, \ldots, \delta_p)$, for that particular cell. Then the
smoothing function for that cell would be $g(w; \tilde\theta)$ where $\tilde\theta = \hat\theta + \delta$. A
representative “smoothed observation” for that cell might then be the
quantity $g(\bar{w}; \tilde\theta)$, where $\bar{w}$ is, say, the mid-point of the cell.

This process permits each cell, overall, to determine a single value 
to represent it in the entire fitting process and diminishes the chance 
that a random selection from a cell may be unnecessarily nonrepresenta- 
tive of that cell behavior. 
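
The single stage of linear adjustment within a cell can be sketched as
follows. This is an illustrative implementation under stated assumptions,
not the procedure as programmed for the paper: the model and its gradient
are supplied as hypothetical callables, the regression is solved through
the normal equations, and all names are invented for the example.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular for this cell")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def smooth_cell(model, gradient, theta, cell_w, cell_y, w_mid):
    """One linear adjustment of theta using the data of a single cell.

    model(w, theta)    -> fitted value g(w; theta)
    gradient(w, theta) -> list of partial derivatives dg/dtheta_k
    Returns (adjusted coefficients, smoothed value at w_mid).
    """
    p = len(theta)
    normal = [[0.0] * p for _ in range(p)]
    rhs = [0.0] * p
    for w, y in zip(cell_w, cell_y):
        d = gradient(w, theta)
        r = y - model(w, theta)          # residual from the preliminary fit
        for a in range(p):
            rhs[a] += d[a] * r
            for b in range(p):
                normal[a][b] += d[a] * d[b]
    delta = solve_linear_system(normal, rhs)
    adjusted = [t + dt for t, dt in zip(theta, delta)]
    return adjusted, model(w_mid, adjusted)


if __name__ == "__main__":
    # Hypothetical straight-line model, used only to exercise the code.
    line = lambda w, th: th[0] + th[1] * w
    line_grad = lambda w, th: [1.0, w]
    print(smooth_cell(line, line_grad, [1.0, 1.0],
                      [0.0, 1.0, 2.0, 3.0], [2.0, 5.0, 8.0, 11.0], 1.5))
```

A cell must contain at least as many well-spread observations as there
are coefficients; otherwise the normal equations are singular or
ill-conditioned, and the adjustment would have to be restricted to fewer
coefficients.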

If one had wished to fit to all the available data, then the smoothed 
cell values would be weighted in proportion to the number of data 
in the cell. In the present case, this was deliberately not done. 

The goodness of fit of a model to smoothed cell values, not dif- 
ferentially weighted, cannot be statistically judged directly from the 
analysis of variance since the residuals are no longer individually 
statistically comparable and the mean square residual is not an estimate 
of the error variance of the observations. However, the fitted model 
can be assessed by functions of its residuals from the original data 
(or a sample thereof). 


B.8 Probability Plotting 


The techniques of probability plotting are useful for data analysis
in a wide variety of circumstances. (See Wilk and Gnanadesikan for
a general discussion of probability plotting techniques.) For instance,
in the present work, plots of residuals against various variables have
provided invaluable guidance, but one is also interested in the
distributional behavior per se of the collection of residuals. As presented
in Section 8.1, normal and half-normal probability plots have been
used for this purpose.

The rationale for such probability plots is roughly as follows: If one
draws a random sample of size n from a population which is normally
distributed with mean $\mu$ and variance $\sigma^2$, then the ordered observations
would be expected to approximate, roughly, to a linear function,
$\mu + \sigma z_i(n)$, of appropriate “representative” values $z_i(n)$ from a standard
normal ($\mu = 0$, $\sigma^2 = 1$) distribution. Thence a plot of the ordered
observations against the $z_i(n)$ would tend to be linear, with intercept
approximately $\mu$ and slope approximately $\sigma$. For the representative
value, $z_i(n)$, corresponding to the ith ordered observation, one can use
the standard normal quantile for the proportion $(i - \tfrac{1}{2})/n$.
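
The plotting positions described above can be generated with the Python
standard library. The sketch below is illustrative only: the function
names are hypothetical, and the use of the same proportions (i - 1/2)/n
for the half-normal case is an assumption made here for symmetry with the
normal case.

```python
from statistics import NormalDist


def normal_plot_points(values):
    """Pair each ordered value with the standard normal quantile for (i - 1/2)/n."""
    ordered = sorted(values)
    n = len(ordered)
    std = NormalDist()
    return [(std.inv_cdf((i - 0.5) / n), y)
            for i, y in enumerate(ordered, start=1)]


def half_normal_plot_points(values):
    """Pair ordered absolute values with standard half-normal quantiles."""
    ordered = sorted(abs(v) for v in values)
    n = len(ordered)
    std = NormalDist()
    # The half-normal quantile for proportion q is the normal quantile for (1 + q)/2.
    return [(std.inv_cdf(0.5 + 0.5 * (i - 0.5) / n), y)
            for i, y in enumerate(ordered, start=1)]


if __name__ == "__main__":
    residuals = [0.12, -0.30, 0.05, 0.41, -0.08, -0.15, 0.22]   # made-up values
    for quantile, value in normal_plot_points(residuals):
        print(round(quantile, 3), value)
```

Plotting the second element of each pair against the first gives the
probability plot; a roughly straight configuration has slope near the
standard deviation of the plotted values.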

This plotting technique displays the individual observations in a 
sample graphically and does so against a backdrop such that the ex- 
istence of outliers and asymmetry, as well as other distributional prop- 
erties, are sensitively indicated. Of course such plots are usually profit- 
ably supplemented by others that order or partition the data according 
to information extraneous to the responses themselves. 

We expect the mean of the residuals, Y - y, in the present study (see
Section 8.1) to lie near 0. Also we expect that their variances will be
approximately the same, since that is the purpose of the square root
transformation. As a further benefit of the square root transformation
we expect that the distribution of the residuals will tend to be
symmetric and to approach normality; thence the present application of
normal probability plotting of the residuals. The fact that these
residuals are not entirely statistically independent (since they derive from
a commonly estimated fitted function) is a minor issue since the
number of observations is so much larger than the number of fitted
coefficients.

Half-normal probability plotting employs the ordered absolute re- 
siduals plotted against standard half-normal (standard normal folded 
at 0) distribution quantiles. Such a plot eliminates any symmetry-type 
information but provides an incisive focus in bringing together on the 
plot the largest departures from fit. 

Probability plots can provide very sensitive indications of distribu- 
tional peculiarities especially in regard to “overly” large values. Some- 
times the indications are of little practical interest, such as minor 
lumps which one can see in Fig. 29, but in other regards, such as in
estimating an “intrinsic” error standard deviation, the plots may per- 
mit a good judgment on how to discount or correct for apparently 
aberrant values which might otherwise have an undue influence, say, 
on mean square error. 

Error standard deviations may be estimated from normal or half- 
normal probability plots as the “slope of the linear configuration.” 
Typically, it will not be relevant to make a great show of objectivity 
in this process since the declared purpose is to permit an informal dis- 
counting of unexpected distributional peculiarities. Thus, in Fig. 29, 
one takes the slope as defined essentially by the bulk of residuals, 
ignoring the few largest. 


APPENDIX C 


Statistical Measures Over All the HTB Data 


This appendix presents various statistical measures over all the 
41,135 HTB data. These measures concern the fit of Models I and II 
and the partition of the x,L space (as described in Section 7.1 and 
Appendix B.3) into 1034 cells of which 813 were nonempty of
observations. The partition is such that the range of y within cells is
relatively small. For each cell, two functions are used: (i) The mean
square deviation (MSD) defined as

\[ \mathrm{MSD}(u) = \frac{1}{n-1} \sum_{i=1}^{n} (u_i - \bar{u})^2, \tag{32} \]

where the cell has n observations and u_i denotes some function of a
cell observation, e.g., Y_i or Y_i², and ū is the mean of the u_i in the
cell; (ii) The mean square residual (MSR) defined as

\[ \mathrm{MSR}(Y) = \frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2, \tag{33} \]

where y_i is the fitted value (from Model I or II) corresponding to the
observed Y_i.
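
The two cell statistics translate directly into code. The sketch below
assumes a divisor of n - 1 for the MSD and of n for the MSR; the function
names and the sample cell are illustrative only.

```python
def cell_msd(u):
    """Mean square deviation of the values u within one cell (divisor n - 1, assumed)."""
    n = len(u)
    mean = sum(u) / n
    return sum((ui - mean) ** 2 for ui in u) / (n - 1)


def cell_msr(observed, fitted):
    """Mean square residual of observed Y about fitted y in one cell (divisor n, assumed)."""
    n = len(observed)
    return sum((Y - y) ** 2 for Y, y in zip(observed, fitted)) / n


# Hypothetical cell with four observations on the square-root scale.
Y = [1.0, 1.2, 0.9, 1.1]
y_fit = [1.05, 1.05, 1.05, 1.05]
print(cell_msd(Y), cell_msr(Y, y_fit))
```

The MSD measures scatter about the cell's own mean, whereas the MSR
measures scatter about the model, so the MSR also picks up any local bias
of the fit.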


C.1 Empirical Justification of Square Root Transformation 


Figs. 56 and 57 show plots of MSD(Y²) versus the cell mean of Y²
and MSD(Y) versus the cell mean of Y, respectively. It is seen that
MSD(Y²) shows a distinct and major dependence on the average
value of the counting rate, Y², while MSD(Y) does not show
systematic increase relative to the average value of Y, except, as expected,
in the close neighborhood of zero counting rate.

A more detailed analysis of the results of Fig. 56 indicates that the
dependence of MSD(Y²) on cell mean of Y² is somewhat curvilinear,
having larger slope for larger Y² values. This curvilinearity is very
likely mainly due to the mode of definition of the x,L cells. The
procedure used tends to produce cells which are “too large” in regions
where the counting rate is also large, thus leading to an apparent
extra increase in MSD(Y²) with Y². At all values, however, the
dependence of MSD(Y²) on Y² is greater than the slope 0.09 (=1/11)
which would be associated with the Poisson distribution. The
empirically observed slope varies from about 0.15, based on small values,
to 0.8, based on large values of the MSD(Y²).

Fig. 56 - Cell MSD(Y²) vs cell mean of Y² for the x,L cells defined in
Section 7.1 and Appendix B.

Fig. 57 - Cell MSD(Y) vs cell mean of Y for the x,L cells defined in
Section 7.1 and Appendix B.

These results suggest that one cannot hope to achieve, by means of
any fitted model based on x,L coordinates, on the scale of Y, a mean
square residual (error) as low as 0.023 which is associated with the
Poisson assumption.
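
The two Poisson reference values quoted in this section, 0.09 (= 1/11)
for the slope on the counting-rate scale and 0.023 on the square-root
scale, can be checked with a short calculation. The sketch assumes that
each observation is a Poisson count N accumulated over T = 11 time units
(inferred here from the factor 1/11) and that Y is the square root of the
rate N/T; the function name and the example mean count are illustrative.

```python
import math


def sqrt_rate_variance(mean_count, T=11.0):
    """Exact variance of sqrt(N / T) for a Poisson count N with mean_count > 0."""
    upper = int(mean_count + 20.0 * math.sqrt(mean_count) + 20)
    m1 = m2 = 0.0
    for k in range(upper + 1):
        # Poisson probability evaluated in log space to avoid overflow.
        log_p = -mean_count + k * math.log(mean_count) - math.lgamma(k + 1)
        p = math.exp(log_p)
        m1 += p * math.sqrt(k / T)  # accumulates E[Y]
        m2 += p * (k / T)           # accumulates E[Y^2]
    return m2 - m1 * m1


T = 11.0
print("slope of Var(rate) vs mean rate:", 1.0 / T)       # Var(N/T) = rate / T
print("delta-method Var(sqrt(rate)):", 1.0 / (4.0 * T))  # independent of the rate
print("exact value at 110 mean counts:", sqrt_rate_variance(110.0, T))
```

On the rate scale the Poisson variance grows in proportion to the mean,
while on the square-root scale it is nearly constant away from zero
counts, which is the motivation for the transformation.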

Although the Poisson assumption provided a useful stimulus toward 
a profitable transformation of the data, these results confirm that 
it would have been unwise to have tied oneself too closely to the 
assumption as a complete basis for analysis, as for instance in basing 
the fit on maximization of the Poisson likelihood function (see 
Appendix B.2). Possible sources of variability and error in the data, 
beyond Poisson fluctuations in counts, have been discussed elsewhere 
in this paper. 


C.2 Determination of Weights 


The sample selection procedure involved “weighting” the 813
nonempty cells by selecting 2, 3/2, or 1 observation per cell. The observed
MSD(Y) were classified into three groups defined by: 0 ≤ MSD ≤
0.013; 0.013 < MSD ≤ 0.02; 0.02 < MSD. The x,L coordinates of the
midpoints of cells so identified are shown in Fig. 58. The actual
assignment of weights was based on applying contiguity considerations to this
plot.
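
The grouping step can be written as a small function. This is only a
sketch of the classification by MSD(Y) range; how each group maps to the
weights 2, 3/2 and 1 was settled by contiguity considerations on the plot
and is not reproduced here. The function name and the sample values are
hypothetical.

```python
from collections import Counter


def msd_group(msd):
    """Return 1, 2 or 3 for the three MSD(Y) ranges quoted in the text."""
    if msd <= 0.013:
        return 1
    if msd <= 0.02:
        return 2
    return 3


# Hypothetical cell MSD(Y) values.
cell_msds = [0.008, 0.012, 0.015, 0.019, 0.025, 0.040]
print(Counter(msd_group(m) for m in cell_msds))
```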


C.3 Analysis of Variance Over All the HTB Data 


Table VIII summarizes the analysis of variance over all the 41,135
HTB data. The table covers the fit of Models I and II to all the data,
using the estimated coefficients (see Tables IV and V) from the fit
to the 960-point sample. Also, one can regard the collection of
averages of the Y values in each of the 813 nonempty cells as
providing a fit depending on 813 fitted quantities. The corresponding
“error” (cell deviations) is the pooled cell MSD(Y). Finally, the
residuals of the fit of Model I (or II) can be “fitted” by 813 cell
averages of the residuals, leading to an “error” which is the pooled
cell MSD(Y - y), i.e., due to the cell deviations of the Model I (or
II) residuals.

Fig. 58 - Positions of centers of regions in x,L space having certain
ranges of cell MSD(Y). The ranges are indicated in the legend.

TABLE VIII - ANALYSIS OF VARIANCE OVER ALL THE HTB DATA
(41,135 POINTS MINUS 226 OUTLIERS)

Due to                        d.f.      Sum of squares   Mean square
Total (41,135 - 226)          40,909    230,267.45
Mean                               1    115,755.39
Corrected total               40,908    114,512.07
Model I residuals             40,900      1,411.3        0.0345
Model II residuals            40,901      1,419.6        0.0347
Cell deviations               40,096      1,541.4        0.0384
Cell dev. of Model I res.     40,085      1,167.0        0.0291
Cell dev. of Model II res.    40,086      1,166.9        0.0291

Multiple R² value
Model I                       0.988
Model II                      0.988

The fits to all the data provided by Models I and II are equally 
good, as was true for the 960-point sample. The mean square residual
over all the data (about 0.035) is lower than the value (about
0.036) obtained for the sample even though the fit of the model was 
determined by the sample. This bears out the expectation (see Ap- 
pendix B.3) that the mode of selection of the sample is such that the 
sample was harder to fit on a per-observation basis than the entire 
body of data. 

The cell means provide overall a poorer actual fit than Model I or II,
and allowing for the number of fitted coefficients, the mean square for 
cell deviations exceeds that for the models by about 12 percent. 

Fitting cell means to the model residuals yields an additional sub- 
stantial reduction in the sum of squares of the model residuals and a 
mean square of about 0.029, which is some 17 percent lower than the 
value for the models. If in fact the models gave an “unbiased” fit every- 
where, then one would expect that the values of pooled MSR(Y) and 
pooled MSD(Y — y) would be nearly the same. The excess of the 
former is due mainly to systematic inadequacies of the fit (see 
Appendix C.4). 

The value 0.029 represents virtually a lower bound on the
achievable ‘mean square error’ for this body of data. Despite its downward
bias from the substantial number of ‘zero counting rate’ observations,
it exceeds the ‘Poisson’ variance of 0.023 by about 25 percent. This
excess is probably due to a combination of factors, including
incomplete elimination of temperature and bias voltage instrumental effects,
as discussed further below.
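
The figures in Table VIII and in the discussion above are tied together
by simple arithmetic, which the following sketch reproduces. The variable
names are illustrative; the degrees of freedom and sums of squares are
taken from Table VIII, and 0.023 is the Poisson value quoted in
Appendix C.1.

```python
corrected_total_ss = 114_512.07

# (degrees of freedom, sum of squares) for each line of Table VIII
rows = {
    "Model I residuals": (40_900, 1_411.3),
    "Model II residuals": (40_901, 1_419.6),
    "Cell deviations": (40_096, 1_541.4),
    "Cell dev. of Model I res.": (40_085, 1_167.0),
    "Cell dev. of Model II res.": (40_086, 1_166.9),
}

# Mean square = sum of squares / degrees of freedom.
mean_square = {name: ss / df for name, (df, ss) in rows.items()}

# Multiple R^2 = 1 - (residual sum of squares) / (corrected total sum of squares).
r_squared = {
    name: 1.0 - rows[name][1] / corrected_total_ss
    for name in ("Model I residuals", "Model II residuals")
}

for name, ms in mean_square.items():
    print(f"{name:28s} mean square = {ms:.4f}")
for name, r2 in r_squared.items():
    print(f"{name:28s} multiple R^2 = {r2:.3f}")

# Excess of the pooled cell MSD(Y - y) for Model I over the Poisson value.
poisson_variance = 0.023
excess = mean_square["Cell dev. of Model I res."] / poisson_variance - 1.0
print(f"excess over Poisson variance: {100 * excess:.0f} percent")
```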

The ‘improvement’ of the MSD(Y — y) over MSR cannot be taken 
to mean that some smooth “simple” adjustment of the model based on 
x,L coordinates might be found so as to yield similar improvement. 
Some of the bias apparently associated with x,L coordinates in dif- 
ferent regions may be due to artifactual association with temporal, 
instrumental, or other small effects and such corrections could not be 
made overall in terms of a “simple” x,L dependence. 


C.4 Analysis of the Excess Variation 


A study of plots of cell MSD(Y) against each of y, x, and L
indicates that large MSD values occur mainly in the 1.2 < L < 1.7
region, at high average counting rates. This excess is due largely to
the hybrid mode of x,L cell formation, in which the L-slices were
equal length intervals, while within each L-slice, the x segments were
chosen to have equal increments of y. Thus, at L values where y is
large, the x,L cells will tend to have a larger y range.

Fig. 59 - Cell MSR(Y) vs central value of y’ for cell.

Fig. 60 - Positions of centers of regions in x,L space having certain
ranges of cell MSR(Y)/MSD(Y). The ranges are indicated in the legend.
(Plotted points are mid-points for the cells. Points appearing to the
right of the boundary R = 2.0 R_e represent cells which have data only
in one corner.)

The tendency of MSD to rise with cell average counting rate is 
not mirrored by cell MSR behaviour. As shown by Fig. 59, the level 
of MSR is not dependent on y except, as expected, for those cells 
where the counting rate is near zero. Roughly speaking, the average 
level of cell MSR for y values away from zero is about 0.04, in agree- 
ment with the probability plot estimate of Section 8.1. Of course, Fig. 
59 shows both smaller ordinate values and less dependence on the 
abscissa values than the comparable plot of Fig. 57. 

The relation of cell MSR to cell MSD is partially indicated in Fig.
60, showing positions in x,L space of various ranges of the value of
the ratio MSR/MSD. One sees that MSR tends to exceed MSD along
the “outside” of the data region. The excess along the R = 2 R_e
boundary is due mainly to model bias or inadequacy. The excess at
high L-high x is probably associated with temporal effects. The large
ratios along the left edge of the data, which is the cutoff region, are
likely a reflection of deficiency of the function. The excess of MSR
over MSD is associated in the main with small y values.

Fig. 61 shows cell mean square deviations of residuals, MSD(Y - y),
plotted against y. This plot shows less scatter (most noticeably for
MSD(Y - y) > 0.075) than that of Fig. 59, and a lower average level
of MSD(Y - y) for y > 0, as expected. The high values of MSD(Y - y)
are not related to y as such but rather, as other plots show, to
“extra fluctuations” in the 1.2 < L < 1.7 region. This is probably
associated with the coarse HTB data partition which does not
completely take care of the temperature and bias voltage instrumental
effects.

Fig. 61 - Cell MSD(Y - y’) vs central value of y’ for cell.
Gold Doped Silicon Compandor Diodes
For N2 and N3 Carrier Systems 


By K. R. GARDNER and T. R. ROBILLARD 
(Manuscript received May 2, 1967) 


Companding has proven to be a valuable technique for improving 
the signal-to-noise ratio of voice transmission at baseband frequencies. 
A compandor consists of a compressor element which reduces the dy- 
namic range of a transmitted signal in a predetermined manner and an 
expandor element which restores the signal range at the receiver. Prac- 
tical Bell System applications to date have used electron tubes, ger- 
manium point-contact semiconductor diodes and unpassivated silicon 
mesa diodes. Each of these variolosser elements had serious short- 
comings. Two new diode pairs have been designed which eliminate the 
problems of impedance range control and linearity, diode noise and 
electrical stability. The new design utilizes heavy gold doping of a 
planar oxide-passivated wafer design to produce a bulk controlled de- 
vice capable of unusually high manufacturing yields. 


I. INTRODUCTION 


Compandors are a special application for a diode because the diodes 
are used as variolossers and the electrical parameter which must be 
controlled is the small signal forward impedance as a function of bias 
current. Furthermore, control of impedance is required over two orders 
of magnitude. Other requirements are low noise and good stability. The 
484/489A and 484/489B diode pairs, which are electrically identical 
and differ only in mechanical outline, are silicon “planar” diodes which 
were designed for use in this application. The new devices were
designed to replace two pairs of troublesome unpassivated “mesa” type
devices in both the N2 and N3 carrier systems.

A comprehensive diode design was not previously available for this 
application. Diodes were obtained by selection from available types at 
low yield. This paper discusses the theoretical and empirical design 
and the fabrication of the new diodes. 
II. CIRCUIT FUNCTION OF COMPANDORS


Noise is an important problem associated with long distance tele- 
phone transmission. Elimination or reduction of this undesirable effect 
is a consideration in the design of all transmission equipment. Noise 
occurs in transmission from many sources such as thermal noise, ex- 
ternal interferences, and crosstalk. The compandor circuit, which was 
first introduced into the Bell System in the transatlantic radio circuit 
in 1932, is one of the methods used to reduce noise. While the first 
compandor circuits used vacuum triodes¹ as the variolosser units, later
compandor circuits used semiconductor diodes when they became
available.

A compandor² is composed of two parts, compressor and expandor,
one at each end of a transmission path. The compressor circuit com- 
presses the dynamic range of the transmitted signal power by taking 
the square root of the signal (although other functions could be used). 
If the maximum signal levels are transmitted at the same power with 
compression as they would be without compression, then the minimum 
signals will be transmitted at relatively higher power with compression 
than without. Therefore, a higher signal-to-noise ratio results for the 
minimum signals on the transmission path if compression is used. At 
the receiving end of the transmission line the expandor circuit squares 
(expands) the signal to its original dynamic range. 

The N2 and N3 carrier system compandors³ compress a 60-dB signal
range into a 30-dB range for transmission. Therefore, 30-dB higher
noise may be potentially tolerated in the transmission path. At the
receiving terminal the expandor portion of the compandor expands the
signal range to its original value of 60 dB and restores the signal to its
original form. Since a compandor is an interchangeable plug-in unit and
the compressor and the expandor in the same unit do not work together,
it is necessary that all compressor circuits track closely with all
expandor circuits.
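
The effect of square-root compression on signal levels can be illustrated
in decibels. The sketch below assumes an idealized compandor whose
compressor halves every level (in dB relative to the maximum signal
level) and whose expandor doubles it; the function names and the 0 dB
reference are illustrative assumptions.

```python
def compress_db(level_db):
    """Square-root compression: halve the level (dB relative to maximum signal)."""
    return level_db / 2.0


def expand_db(level_db):
    """Squaring expansion: double the level (dB relative to maximum signal)."""
    return level_db * 2.0


for level in (0.0, -20.0, -40.0, -60.0):
    sent = compress_db(level)
    print(f"input {level:6.1f} dB -> line {sent:6.1f} dB -> output {expand_db(sent):6.1f} dB")

# The weakest signal is transmitted 30 dB above its uncompressed level,
# which is the margin gained against noise picked up on the line.
advantage_db = compress_db(-60.0) - (-60.0)
print("advantage for the weakest signal:", advantage_db, "dB")
```

Because expansion is the exact inverse of compression only when both
follow the same law, any mismatch between a compressor and the expandor
at the far end appears directly as a level error, which is why close
tracking between all units is required.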

The core of the compressor and expandor circuits is a pair of vario- 
losser diodes. The stringent requirements on the compandor circuits are 
reflected in stringent requirements on the variolosser diodes. This paper 
reports the development of two diode pairs which meet the unique re- 
quirements of these circuits. 


III. THE DIODES 
3.1 General Description 

The first semiconductor diodes used as the compandor variolossers
were selected from available types. While the New York-London long
wave radiotelephone circuit (1932) used vacuum triodes,¹ the N1
carrier system used germanium point-contact diodes;⁴ the P1 and O
carrier systems used silicon alloy diodes;⁵ and the N2 carrier system
originally used diffused silicon mesa diodes. Several problems
arose with the use of these state-of-the-art diodes although
careful selection and circuit adjustment could correct most of them. The
major problems of high device cost, high noise, and periodic supply
shortages arose directly from a lack of understanding of the physical
mechanism controlling the forward impedance characteristic.

It was possible, by designing a new diode, to overcome all of the 
problems and at a much lower cost. The new design uses silicon planar 
techniques coupled with controlled gold doping and heat treatment to 
produce the desired diode characteristics. 

The following parameters are used to characterize the diodes for the 
compandor applications: 


(i) The small-signal forward impedance,* Z_f, at a specified
mid-range dc bias current.

(ii) The ratio, R_1, of the small-signal forward impedance at a
specified lower current to the impedance at the above mid-range
current.

(iii) The ratio, R_2, of the impedance at the mid-range current to the
impedance at a specified higher current.

(iv) The impedance difference between the diodes of a pair measured
separately at the idling current.

(v) Noise generated by the diode over the current and frequency
ranges of interest.


Table I shows the limits for these parameters for both the mesa and 
the planar type diodes. 


3.2 Design Theory 

The primary parameter to be controlled was the small-signal
forward impedance, Z_f. The theoretical forward impedance of the
semiconductor diode may be obtained by differentiation of the
current-voltage equation. For semiconductor diodes the relationship is:

\[ I_F = I_s \left[ \exp(qV/nkT) - 1 \right], \tag{1} \]

where

I_s = saturation current,
q = electronic charge,
V = applied voltage,
n = experimentally determined constant commonly between 1 and 2,
k = Boltzmann’s constant,
T = absolute temperature.

*For simplicity the expression ‘small-signal forward impedance’ will often be
shortened to ‘forward impedance’ or ‘impedance’ in this paper.

Therefore,

\[ Z_f = \frac{\partial V}{\partial I_F} = \frac{nkT}{q I_s} \exp(-qV/nkT). \]

TABLE I - SALIENT CHARACTERISTICS

                                        Planar            Mesa
                                        compressor        compressor
Parameter                               and expandor      and expandor      Units

Z_f, Forward impedance, at 50 µA dc bias
  For single diode of pair              900 ± 35          1045 ± 125        ohm
  For diode pair in series              1800 ± 70         2070 ± 70*        ohm
R_1, Impedance ratio = (Z_f at 10 µA dc)/(Z_f at 50 µA dc)
  For single diode of pair              5.0 ± 0.2         4.9 ± 0.4         -
  For diode pair in series              5.0 ± 0.2         4.9 ± 0.2*        -
R_2, Impedance ratio = (Z_f at 50 µA dc)/(Z_f at 300 µA dc)
  For single diode of pair              6.0 ± 0.2         6.2 ± 0.5         -
  For diode pair in series              6.0 ± 0.2         6.15 ± 0.2*       -

Parameter                               Compressor only   Compressor only   Units

ΔZ_f = Difference in impedance of diodes of pair measured separately:
  At 2 µA dc bias                       2000 max          2000 max          ohms
  At 10 µA dc bias                      500 max           500 max           ohms
v_n = Noise voltage of single diode or pair at 2.5 µA dc bias.
  Bandwidth 200-3500 Hz. Parallel resistance 17,000 ohm
                                        20 max            Se                µVrms

* Computer selected.

For forward bias, V, greater than a few nkT/q (kT/q = 0.026 volts
at room temperature) one has exp(qV/nkT) ≫ 1 and to a good
approximation*

\[ Z_f = \frac{nkT}{q} \frac{1}{I_F}. \tag{2} \]

It is this Z_f-I_F functional relation which is used in the design of the N2
and N3 compandor circuits.
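
Equation (2) is easy to evaluate numerically. The following sketch uses
kT/q = 0.026 volts as quoted in the text; the function names and the trial
values of n are assumptions made for illustration. It also evaluates the
relative error of approximation (2) with respect to the exact derivative
of (1), which works out to 1/[exp(qV/nkT) - 1].

```python
import math

KT_OVER_Q = 0.026  # volts at room temperature, as quoted in the text


def forward_impedance(bias_current_amp, n):
    """Small-signal forward impedance from Eq. (2): Z_f = n (kT/q) / I_F, in ohms."""
    return n * KT_OVER_Q / bias_current_amp


def approximation_error(voltage, n):
    """Relative error of Eq. (2) against the exact derivative of Eq. (1)."""
    return 1.0 / math.expm1(voltage / (n * KT_OVER_Q))


for n in (1.0, 1.75, 2.0):  # trial values of n, chosen for illustration
    z50 = forward_impedance(50e-6, n)
    ratio_low = forward_impedance(10e-6, n) / z50    # Z_f at 10 uA over Z_f at 50 uA
    ratio_high = z50 / forward_impedance(300e-6, n)  # Z_f at 50 uA over Z_f at 300 uA
    print(f"n = {n:4.2f}: Z_f(50 uA) = {z50:6.0f} ohm, "
          f"ratios = {ratio_low:.1f}, {ratio_high:.1f}")

print("relative error at 0.3 V, n = 2:", approximation_error(0.3, 2.0))
```

Within this idealized model a constant n gives impedance ratios of exactly
5 and 6 for the 10/50 µA and 50/300 µA current pairs, whatever the value
of n, so the ratio limits in Table I amount to limits on how much n may
vary with bias current.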

There are five sources of current in a forward biased p-n junction:
diffusion, bulk recombination, surface recombination, channel and
tunneling currents. The total diode current is given by

\[ I_F = I_D\ \text{(diffusion)} + I_{BR}\ \text{(bulk recombination)} + I_{SR}\ \text{(surface recombination)} + I_{CH}\ \text{(channel)} + I_T\ \text{(tunneling)}. \tag{3} \]
The diffusion current⁷ at small bias is given by

\[ I_D = I_s \left[ \exp(qV/nkT) - 1 \right], \tag{4} \]

where

\[ I_s = qA \left[ p_n (D_p/\tau_p)^{1/2} + n_p (D_n/\tau_n)^{1/2} \right] \]

and
τ_p = lifetime of holes on n side of junction,
τ_n = lifetime of electrons on p side of junction,
p_n = hole concentration in n-region,
n_p = electron concentration in p-region,
D_p = diffusion constant for holes,
D_n = diffusion constant for electrons,
A = area.

At high forward bias (injection) the current will be given by

\[ I_D = I_s \left[ \exp(qV/2kT) - 1 \right] \tag{5} \]

and in the intermediate range the current equation will be similar to (1).
The bulk recombination current⁷,⁸ for bias voltages, V, greater than
several kT/q is given by

\[ I_{BR} = I_{br} \exp(qV/2kT), \tag{6} \]

where

\[ I_{br} = \frac{\pi}{2} \left( \frac{kT}{qE} \right) \frac{q n_i}{\tau_0} A, \]

n_i = intrinsic carrier concentration,
E = electric field at junction,
and τ_0 = lifetime.

* The error is less than 1 percent for voltages greater than 0.2 volts and less
than 0.1 percent for voltages greater than 0.3 volts.

The exact voltage dependence of (6) depends in a complicated way
on the physical parameters of the junction, but in a given range may
be described by

\[ I_{BR} = I'_{br} \exp(qV/n_{br}kT), \tag{7} \]

where n_br ≤ 2 and accounts for the voltage dependence of I_br, while
I′_br is the voltage independent factor.

Surface recombination current may be described by a similar
equation:

\[ I_{SR} = I_{sr} \exp(qV/n_{sr}kT), \tag{8} \]

where n_sr > 1.

Channel current⁸ at V > kT/q may be described by

\[ I_{CH} = I_{ch} \exp(qV/n_{ch}kT), \tag{9} \]

where n_ch = 1 up to 4 or 5 or more for poorly stabilized surfaces.

By considering the five currents in parallel one may calculate a
small-signal impedance for each current, and the forward impedance
of the diode may be expressed as five impedances in parallel:

\[ \frac{1}{Z_f} = \frac{1}{nkT/qI_D} + \frac{1}{n_{br}kT/qI_{BR}} + \frac{1}{n_{sr}kT/qI_{SR}} + \frac{1}{n_{ch}kT/qI_{CH}} + \frac{1}{Z_T} = \frac{q}{kT} \left( \frac{I_D}{n} + \frac{I_{BR}}{n_{br}} + \frac{I_{SR}}{n_{sr}} + \frac{I_{CH}}{n_{ch}} + \frac{kT}{qZ_T} \right). \tag{10} \]

Diffusion current cannot be made dominant over recombination 
current in silicon except at high current densities where the value of the 
multiplier, n, may be modified by carrier injection. In both mesa and 
planar diodes the diffusion current was reduced by using heavily doped 
starting material (approximately 0.005 ohm-cm p-type silicon). The 
use of such low resistivity material had the further advantage of re- 
ducing the series resistance of the bulk silicon to about 0.04 ohms. At 
all currents of interest the 0.04 ohms made a negligible contribution to 
the diode impedance. 

The bulk recombination current was greatly enhanced by introduc- 
ing trapping centers by heavily gold doping the diodes. The effect on 
the diode parameters of various gold doping levels was investigated by 
varying temperature and time of gold diffusion and subsequent heat
treatments. These effects are discussed in detail in Section 3.4.3. 

From the previous equations and measurements on existing diodes 
the relative values of the five currents and their contributions to the 
total impedance may be compared. As an example, values are calcu- 
lated for a bias of 0.4 volts and diode parameters which approximate 
those of the actual diodes. 

The diffusion current is calculated from (4) as

\[ I_D = 10^{-9}\ \mathrm{A} = 0.001\ \mu\mathrm{A}, \]

where the following values are assumed:

τ_p = τ_n = 10° sec (Ref. 9)
N_A = N_D = 2 × 10¹⁹ cm⁻³ (0.005 ohm-cm)
μ_p = 30 cm²/volt sec (Ref. 10)
μ_n = 75 cm²/volt sec (Ref. 10).

The bulk recombination current is calculated from (6) by using the
approximation

\[ E = (\psi_0 - V)/W, \]

where ψ_0 is the built-in voltage and W is the space-charge width. The
bulk recombination current is

\[ I_{BR} = 30\ \mu\mathrm{A} \gg I_D = 0.001\ \mu\mathrm{A}. \]

Estimates of surface recombination and channel currents were made 
from measurements on planar type diodes. These estimated values 
were much smaller (by several orders of magnitude) than the bulk re- 
combination current. Likewise, the observed magnitude of the forward 
tunneling current is negligible since doping levels are relatively light 
and the junction is graded. 

Hence, bulk recombination current is dominant and the junction
impedance becomes

\[ Z_f = \frac{nkT}{q} \frac{1}{I_F} \approx \frac{nkT}{q} \frac{1}{I_{BR}}. \tag{11} \]
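
The dominance of the bulk recombination current can be checked against
the parallel combination of (10). The sketch below uses the two currents
calculated in the text (0.001 µA and 30 µA at 0.4 volts); the values of n
assigned to each branch and the function name are assumptions for
illustration, and the surface, channel and tunneling branches are omitted
because the text describes them as negligible.

```python
KT_OVER_Q = 0.026  # volts at room temperature


def parallel_impedance(branches):
    """Combine branch impedances n_j (kT/q) / I_j in parallel, as in Eq. (10).

    `branches` is a list of (current in amperes, multiplier n_j) pairs.
    """
    admittance = sum(current / (n * KT_OVER_Q) for current, n in branches)
    return 1.0 / admittance


diffusion = (0.001e-6, 1.0)  # I_D from the text; n = 1 assumed
bulk_recomb = (30e-6, 2.0)   # I_BR from the text; n_br = 2 assumed (its upper limit)

z_total = parallel_impedance([diffusion, bulk_recomb])
z_bulk_only = parallel_impedance([bulk_recomb])
print(f"both branches: {z_total:.2f} ohm, bulk recombination only: {z_bulk_only:.2f} ohm")
```

With these numbers the diffusion branch changes the impedance by less
than one part in ten thousand, which is the sense in which the device is
"bulk controlled".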

By way of contrast the impedance of the mesa diode was primarily 
dependent on surface damage introduced during mechanical formation 
of the active diode wafer. High surface recombination (mechanically 
damaged) wafer edges were created when the diffused slices were dia- 
mond sawed into wafers. A portion of this damage was then removed 
by chemical etching to set the diode multiplication factor, n, and hence
forward impedance to the nominal value. 


3.3 Mesa Diode Deficiencies 


System and manufacturing experience pointed out three major 
shortcomings of the unpassivated mesa design. The first of these short- 
comings was lack of good control of the nominal impedance value and 
range. Manufacturing problems were experienced until the planar re- 
design efforts delineated the physical mechanisms controlling forward 
impedance. Even with this understanding manufacturing control could 
not be improved sufficiently to obtain narrow distributions of
impedance; a computer selection of individual diode pairs was necessary
for reasonable yields.

A second electrical characteristic which could not be controlled 
in the manufacturing operation was the noise voltage produced by the 
device in the 200-3500 Hz band. The N2 and N3 systems require that
the noise voltage be less than 20 microvolts for the compressor pair 
and 40 microvolts for the expandor pair when operating at a direct 
current of 2.5 microamperes. This characteristic was checked on a non- 
parametric basis at the equipment assembly location; and, quite fre- 
quently, shipments of diode pairs would be found which exhibited 
excessively high noise. 

Finally, the short-term stability objectives of the systems could 
never be achieved with the unpassivated device. 

As shown in this paper, the redesigned device readily meets all noise 
and stability objectives and permits manufacture of diodes at very 
high yields without the need for computer matching. 


3.4 Planar Diode Design 


3.4.1 Structural Features 


While the primary compandor diode design effort was directed toward
understanding and controlling the physical variables associated with
the active semiconductor chip, the encapsulating structure was also
changed to provide an assembly more suited to printed circuit board
mounting. As shown in Fig. 1, each diode pair of the mesa type was
composed of two metal package diodes molded in epoxy and glued
together with an epoxy cement. This arrangement is costly and results in
a double ended structure whose leads must be trimmed and formed for
mounting. The redesigned diode structures are simply two TO-18
encapsulations which are snap fitted into an acetal copolymer plastic
block. The lead arrangement shown in Fig. 2(a) was used for
immediate production and field replacements; the straight-through lead
arrangement shown in Fig. 2(b) is being used in new equipment which
incorporates modified circuit boards. The latter structure requires
neither lead trimming nor forming for insertion. Code and date
markings are molded into the plastic carrier, which eliminates the need for
coding individual finished devices. The plastic carriers are
bullet-shaped to identify polarity and are color coded to provide positive
differentiation of compressor and expandor pairs in the equipment
assembly areas. The leads of the device are solder coated to facilitate
wave soldering to printed circuit boards.

Fig. 1 - Outline of mesa diode pairs and package outline.

Fig. 2 - Outlines of 484A/B and 489A/B diode pairs.

1460 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967

The essential features of both mesa and planar type wafers are
depicted in Fig. 3. Fig. 3(a) shows the mesa structure used in the
earlier diode. In this case, a p-n junction is formed approximately
0.0002 inch below both faces of a one-inch diameter silicon wafer by
gaseous diffusion, producing a p-n-p structure. One of the p-n junctions
is then removed from the wafer by mechanical lapping. The lapped
slices are next plated with nickel and gold to form ohmic contacts. The
wafers are then cut into 0.045-inch square chips by a diamond sawing
operation to produce the final chip. This chip is subsequently
eutectic-bonded to the package mounting stud, etched to remove a controlled
amount of sawing damage (thus adjusting the impedance to the
nominal value), and finally spring contacted during final encapsulation to
complete the device. A cut-away view of this structure is shown in
Fig. 4.

Fig. 3 - (a) Mesa structure of component diodes. (b) Planar structure of
component diodes.

Fig. 4 - Cut-away view of mesa diode package.
Fig. 3(b) depicts the mechanical features of the planar wafer. In this
case, a p-n junction is formed in the p-type silicon by diffusing
phosphorus through a 0.008-inch hole cut into the protective layer of silicon
dioxide. Since the starting crystal is very heavily doped with boron
(approximately 2 × 10¹⁹ atoms/cc), it is difficult to overdope this
material and produce a deep junction; in this device, the junction lies
0.0003 inch below the initial surface of the silicon. As explained
elsewhere, the silicon is also very heavily gold doped by a high
temperature diffusion to control the recombination-generation current and
thus the diode multiplication factor which in turn controls diode
impedance. An aluminum contact is evaporated and alloyed selectively
into the hole in the oxide to complete wafer fabrication. The planar
wafer is next eutectic-bonded to a gold-plated TO-18 header and a
thermocompression wire bond is made between the metal button and
the header lead. Final closure of the device is accomplished by
resistance welding a Kovar can to the gold-plated Kovar header. A barium
oxide impregnated porous nickel cylinder is brazed to the top of the
can and serves as a moisture getter. A cut-away view of the individual
planar device is shown in Fig. 5.


3.4.2 Fabrication Process 

Many of the basic processes used to fabricate these diodes are
common to other planar silicon devices and have been presented
elsewhere.¹¹,¹² This section, then, will deal mainly with those processes
which determine the forward impedance, noise and stability aspects of
the device. A basic flow chart of the major assembly operations is
presented in Fig. 6. In this chart, the header assembly operations,
getter fabrication, activation and assembly operations and the
semiconductor crystal growing operations are not shown.

Fig. 5 - Cross-section view of planar component diode.

Fig. 6 - Flow chart for fabrication of planar diodes. Only fundamental
operations are shown.


The first fundamental design choice involves the selection of
resistivity type and doping level. The choice of p-type silicon allows use
of a junction diffusant (phosphorus) which can be easily cleaned off
of the surface of the wafer contact area. The choice of very heavily
doped starting material is also of paramount importance in producing
a stable device. It has been demonstrated¹³ that alkali ions, a universal
source of contamination, can electrolyze through a protective silicon
dioxide layer at high temperature under reverse bias and invert the
conductivity type of the p-type material surrounding the junction.
This inverted area can cause high channel currents to flow and also
drastically increase the capacitance of a device when operated under
reverse bias.

When operated in the forward direction, a “channeled” device will
exhibit a multiplication factor of typically 2-4 and occasionally up to
10. Obviously, such changes would drastically shift the impedance
levels of the device. However, with starting material doped to a level
of 2 × 10¹⁹ atoms/cm³, it is estimated from the curves of Ref. 14 that
10¹³ surface charges/cm² would be necessary to invert the material.
Contamination levels of this magnitude are not encountered if
minimal care is exercised in the oxide growing, diffusion and contact
evaporation steps. Hence, as discussed in Section 3.2, the choice of heavily
doped starting material essentially eliminates the contribution of
channel recombination-generation current to the total diode current
when compared to the bulk recombination-generation current. Also,
as calculated in Section 3.2, the diffusion current contribution to the
total diode current for this (or any practical) starting resistivity is
also negligible compared to the bulk recombination-generation current.

The next pair of design choices, heavy gold doping and planar oxide- 
passivated technology, combine to produce a very large bulk con- 
trolled recombination-generation current and a negligibly small sur- 
face current contribution. As indicated in Fig. 6, the polished slice is 
first oxide passivated and then is selectively etched to open 0.008-inch
circular holes in the oxide using photolithographic techniques. Kodak 
Thin Film Resist (KTFR) is used as the emulsion in the photo-shap- 
ing operation. After junction diffusion, the junction assumes the shape 
shown in Fig. 3(b). The junction diffuses laterally as well as verti- 
cally. Lateral diffusion under the oxide layer provides a p-n junction 
which terminates at the semiconductor surface at a low surface charge 
location (under the passivating oxide). The low surface charge results
in low surface recombination current. Thus, the combination of low-resistivity
material, planar oxide-protected construction and heavy gold doping
yields a device which is completely bulk controlled and can be
predictably controlled in manufacture.

The gold doping level must next be selected to provide the desired 
value of diode multiplication factor and hence forward impedance. 
Since many mesa devices are currently in field service, and since both 
the N2 and N3 were designed to accommodate this device, it was desirable
to attempt to set the impedance level at a value of 1035 ohms
at 50 microamperes or a multiplication factor of approximately 2. Since 
values of the multiplication factor at room temperature from gold 
doping as high as 1.85 had been reported in the literature,® this ap- 
proach appeared to offer promise of successfully achieving the desired 
objective. As shown in Fig. 7, the impedance level of the device is a 
very strong function of the gold diffusion temperature. As can be seen 
from this combined plot of impedance and maximum solid solubility*® 
of gold in silicon as a function of temperature, the impedance level is 
directly related to only the bulk properties of the device as calculated 
in Section 3.2 and discussed previously in this section. The impedance
values presented in this plot were achieved with other impedance-controlling
variables held constant. In particular, the time and temperature
used for contact sintering to provide contact adherence were held
at 3 minutes and 625°C, respectively.

Fig. 7 — Forward impedance at 50 µA bias current as a function of gold
diffusion temperature. Diffusion time 10 minutes. Contact sinter time 3 minutes.
Solid solubility after Trumbore.

Fig. 16 presents the variation
of impedance level with contact sintering time at 625°C. It can be 
seen that maximum impedance results when the contact is not sin- 
tered. Heat treatment of the gold-loaded slice (even without metal 
contacts present) results in lowered impedance probably through an 
oxide-gettering or precipitation mechanism. With minimum contact 
sintering, average values of impedance as high as 990 ohms at a for- 
ward current of 50 microamperes have been achieved for a gold diffu- 
sion temperature of 1300°C. The corresponding multiplication factor 
for these experimental conditions is 1.91. For good mechanical adher- 
ence of the contact it was desirable to sinter the contact at about 
625°C for 9.5 minutes (a standard process); hence, the impedance level
of the redesigned device was set at 900 ± 35 ohms for a gold diffusion
temperature of 1300°C. This shift in impedance nominal from the mesa
component diode nominal value of 1045 ± 125 ohms necessitated a
change of a few resistor values in the compandors.
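
The correspondence between forward impedance and multiplication factor quoted above can be checked numerically. The following is a minimal sketch, assuming the small-signal relation Z = nkT/qI used elsewhere in this paper and an assumed junction temperature of 300 K; the function names are illustrative and do not come from the original work.

```python
# Sketch: small-signal forward impedance of a diode, Z = n*k*T/(q*I).
# The 300 K temperature is an assumption made for illustration only.
K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def forward_impedance(n, current_a, temp_k=300.0):
    """Return Z in ohms for multiplication factor n at bias current_a (amperes)."""
    return n * K_BOLTZMANN * temp_k / (Q_ELECTRON * current_a)


def multiplication_factor(z_ohms, current_a, temp_k=300.0):
    """Invert Z = n*k*T/(q*I) to recover n from a measured impedance."""
    return z_ohms * Q_ELECTRON * current_a / (K_BOLTZMANN * temp_k)


if __name__ == "__main__":
    for z in (1035.0, 990.0, 900.0):
        n = multiplication_factor(z, 50e-6)
        print(f"Z = {z:6.0f} ohms at 50 uA -> n = {n:.2f}")
```

Under these assumptions 1035 ohms corresponds to a multiplication factor close to 2 and 990 ohms to about 1.91, in line with the values quoted in the text.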

As discussed in Section 3.3, noise in the low audio frequency range 
was a serious problem with the mesa diode. While a detailed study of 
the physical noise mechanisms in silicon was not undertaken in this
development, design information was obtained which clearly indicates
methods of controlling this important parameter. Control of the 1/f 
low-frequency noise results as a by-product of heavy gold doping for 
impedance control. As illustrated in Fig. 10, the noise voltage of the 
device in the 200-3500 Hz band when biased in the forward direction 
at 2.5 microamperes is independent of the gold level up to about 1200°C 
then drops sharply and begins to level out beyond 1300°C. Since the
mesa device saw no high temperature gold diffusion, no beneficial effect
of the gold was realized.

As seen from Fig. 10, at the specified diffusion temperature of 1300°C 
the bulk of the planar component diodes are approaching the test set 
lower limit of 2.4 microvolts and no devices are approaching the com- 
pressor or expandor limits of 20 and 40 microvolts, respectively. This 
parameter is now easily controlled in manufacture; hence, both com- 
pressor as well as expandor limits have been set at 20 microvolts. 

After oxidation, diffusion and contacting, the slices are simply dia- 
mond scribed, cracked apart, eutectic (gold-silicon) wafer bonded and 
thermocompression wire bonded to the TO-18 header. Finally, the metal
can containing an activated moisture getter is resistance welded to the 
assembled header. The excellent device stability which will be pre- 
sented in a later section is attributable to the use of very low resistiv- 
ity semiconductor material, to extremely high gold doping and to the 
use of oxide passivation techniques. 

The design factors discussed in this section combine to produce a 
device with a very narrow range of impedance, a low noise voltage, 
extremely stable electrical characteristics and which can be produced 
with good manufacturing control. 


3.4.3 Design Variables 

The diffusion of gold into silicon is a complex process involving 
interstitial-substitutional equilibrium.t® In addition, both diffusion 
constant and solid solubility are partially dependent on the concentra- 
tion of other impurities such as boron and phosphorus.*7 1* Because 
complete data were not available on the entire ranges of interest of 
diffusion temperature or boron and phosphorus concentration, and be- 
cause data were not available on the effect of annealing which would 
necessarily occur during the contact sintering, the effects of gold diffu- 
sion temperature and time and contact sintering time were determined 
empirically. A matrix experiment was performed in which one parameter
was varied at a time while the other parameters were held constant.

Fig. 8 — Impedance ratio, R1 = Z(10 µA)/Z(50 µA), as a function of gold
diffusion temperature. Diffusion time 10 minutes. Contact sinter time 3 minutes.


Selected results of the experiments are shown in Figs. 7 through 16. 
Each point represents the average of about 40 diodes. Unless indicated 
otherwise, the gold diffusion temperature was 1300°C, the gold diffu- 
sion time was 10 minutes, and the sintering time was 3 minutes. 

The effect of gold diffusion temperature on Z(50 µA), R1, R2, noise
voltage, capacitance and forward voltage is shown in Figs. 7 through
12. Below about 1200°C the gold diffusion has little effect, but at
higher temperatures the diode parameters depend mainly on the gold
solubility. Above 1200°C the spread of measurements was also much
smaller, which indicates that the bulk rather than the surface properties
were dominant. This information resulted in the choice of a high
gold diffusion temperature of 1300°C.

Fig. 9 — Impedance ratio, R2 = Z(50 µA)/Z(300 µA), as a function of gold
diffusion temperature. Diffusion time 10 minutes. Contact sinter time 3 minutes.

Fig. 10 — Noise voltage (200-3500 Hz) at 2.5 µA forward bias current as a
function of gold diffusion temperature. Diffusion time 10 minutes. Contact
sinter time 3 minutes.

Fig. 11 — Capacitance at 1 MHz and zero bias as a function of gold diffusion
temperature. Diffusion time 10 minutes. Contact sinter time 3 minutes.

Fig. 12 — Forward voltage at 2 µA and 50 µA bias currents as a function of gold
diffusion temperature. Diffusion time 10 minutes. Contact sinter time 3 minutes.

The effect of gold diffusion time on Z(50 µA), forward voltage and
noise voltage is shown in Figs. 13 through 15. At times greater than 10
minutes the forward voltage and noise did not change with time. However,
the diode impedance and hence the multiplier, n, did change,
which means that an equilibrium condition was not reached. Since
gold is known to precipitate or collect in phosphorus doped silicon 
dioxide’? and at dislocations, as well as to form a complex with phos- 
phorus, equilibrium would not be expected only on the basis that solid 
solubility had been reached. Ten minutes was chosen for the diffusion 
time. 

The importance of contact sintering time can be seen in Fig. 16
which shows forward impedance, Z(50 µA), as a function of sintering
time. A sintering time of 9.5 minutes was chosen because it corresponds
to a standard transistor process which results in good contact adherence
and because the slope of impedance versus sinter time is low at
that time.

[Fig. 13 plot: forward impedance in ohms versus diffusion time in minutes.]

Fig. 13 — Forward impedance at 50 µA bias current as a function of gold
diffusion time. Diffusion temperature 1300°C. Contact sinter time 3 minutes.

Fig. 14 — Forward voltage at 2 µA and 50 µA bias current as a function of gold
diffusion time. Diffusion temperature 1300°C. Contact sinter time 3 minutes.

Studies were carried out in which the diffusion depth was varied 
from 0.3 to 0.8 mils while holding the gold diffusion to the standard 
conditions. There was no effect on diode parameters. 











[Fig. 15 plot: noise voltage in microvolts versus diffusion time in minutes, with the test set noise voltage level marked.]

Fig. 15 — Noise voltage (200-3500 Hz) at 2.5 µA bias current as a function of
gold diffusion time. Diffusion temperature 1300°C. Contact sinter time 10 minutes.


1470 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967 


3.4.4 Electrical Characteristics 

The salient electrical characteristics of the planar diodes, namely
forward impedance, impedance ratios and impedance differences, are
summarized in Table I. The impedance of a typical unit is shown in
Fig. 17 in the range of 1 µA to 10 mA bias. Impedance measurements are
made at a frequency of 1000 Hz. In the frequency range of interest, 
the capacitance has negligible effect on the impedance. In the worst 
case, at the highest frequency of interest (3500 Hz), and at a forward 
current bias of 10 µA, the capacitive reactance is more than two orders
of magnitude greater than the resistive component. Therefore, the difference
between the total magnitude of impedance and the resistive
component is less than 0.01 percent. The dependence of forward impedance
on temperature is shown in Fig. 18 for Z(50 µA). The temperature
coefficient of 0.85 ohms/°C is less than would be predicted
directly from (2) and implies a temperature dependence of the multi- 
plier, n, which has been noted elsewhere.® 
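
The 0.01 percent figure quoted above for the capacitive contribution can be reproduced with a short calculation. This is a sketch only, assuming the diode is modeled as a resistance in parallel with a capacitive reactance exactly one hundred times larger; the 5000-ohm value is an illustrative number, not a measurement from the paper.

```python
import math


def parallel_rc_magnitude(r_ohms, x_c_ohms):
    """|Z| of a resistance r in parallel with a capacitive reactance x_c."""
    # Z = R(-jX)/(R - jX), so |Z| = R*X / sqrt(R^2 + X^2).
    return r_ohms * x_c_ohms / math.hypot(r_ohms, x_c_ohms)


def percent_error(r_ohms, x_c_ohms):
    """Percent by which |Z| falls below the purely resistive value."""
    return 100.0 * (r_ohms - parallel_rc_magnitude(r_ohms, x_c_ohms)) / r_ohms


if __name__ == "__main__":
    r = 5000.0       # illustrative resistive component in ohms (assumed)
    x_c = 100.0 * r  # reactance two orders of magnitude larger
    print(f"|Z| differs from R by {percent_error(r, x_c):.4f} percent")
```

With a reactance one hundred times the resistance, the difference is about 0.005 percent, which is inside the 0.01 percent bound stated in the text; a larger reactance ratio only reduces it further.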

Although no requirements are placed on forward voltage, a plot of 
forward voltage versus forward current for a typical component diode 
is shown in Fig. 19 for completeness. It is, of course, the linearity of 
this semilogarithmic plot which results in the excellent impedance con- 
trol of the new diodes with current. 

The stability requirement placed on the diodes is that the impedance
value, Z(50 µA), should not drift with time; in particular it should not
drift in the first few minutes of application of bias. The short term
drift, as has been noted, was a problem with the mesa diodes. No short
term drift has been detected in the planar diodes by a test system
capable of detecting a drift of 1 ohm (0.1 percent change). Likewise,
no long term drift has been detected. In one life study a sample
of 24 component diodes showed a drift of less than 0.5 percent (which
was the test set limit) after 4000 hours of greatly accelerated switched
power aging ($I_F$ = 50 mA, $V_R$ (peak) = 5 V) at an ambient of 150°C.

Fig. 16 — Small signal forward impedance at 50 µA bias as a function of contact
sinter time. Nominal sintering temperature was 625°C.

Fig. 17 — Typical forward impedance versus bias current at 25°C for component
diode.

Another important characteristic is the noise generated by the diodes
in the frequency range 200 Hz to 3500 Hz (C message weighting). The
noise appears as a hissing sound to the listener when no voice signal
is present. Measurements are made with 17,000 ohms in parallel with
the diode or diode pair, which is what appears in the actual circuit.
Since the diode impedance at 2.5 µA is comparable to 17,000 ohms, and
the noise voltages add as the square root of the sum of the squares,
the measured noise voltage of two diodes in series is actually less than
the noise voltage of either diode singly. The circuit requirement was
less than 20 µV rms for a compressor pair and 40 µV rms for an expandor
pair. Therefore, by requiring a single diode to have less than
20 µV rms noise, the pairs are guaranteed to meet the 20 µV rms limit.
Most diodes had noise voltages less than or comparable to the test set
limit of 2.4 µV rms. A check of 862 diodes produced during the development
showed only 1 device to fail the noise limit.

Fig. 18 — Typical forward impedance at 50 µA bias current as a function of
temperature.
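
The statement that a series pair measures less noise than a single diode follows from the divider formed with the 17,000-ohm resistor. A minimal sketch, assuming identical diodes with uncorrelated noise, each modeled as an open-circuit noise voltage in series with its small-signal impedance (the 10 µV source value and function name are illustrative assumptions):

```python
import math

LOAD_OHMS = 17_000.0  # resistance shunting the diode(s) during the noise test


def measured_noise(v_open_uv, z_diode_ohms, n_series=1):
    """Noise voltage (uV rms) across the load for n_series identical diodes.

    Uncorrelated open-circuit noise voltages add as the root sum of squares,
    while the source impedances add directly; the load then forms a divider.
    """
    v_total = v_open_uv * math.sqrt(n_series)
    z_total = z_diode_ohms * n_series
    return v_total * LOAD_OHMS / (LOAD_OHMS + z_total)


if __name__ == "__main__":
    v, z = 10.0, 17_000.0  # illustrative values; z chosen equal to the load
    single = measured_noise(v, z, 1)
    pair = measured_noise(v, z, 2)
    print(f"single: {single:.2f} uV, pair: {pair:.2f} uV, ratio: {pair / single:.3f}")
```

When the diode impedance equals the load, the pair reads about 6 percent lower than a single diode, because the doubled source impedance attenuates the signal more than the root-sum-square addition raises it.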

Measurements made on noisy mesa diodes and other diodes produced
during the development of the planar diode and the reasons for the
noise improvement are discussed in the next section.

Fig. 19 — Forward current versus forward voltage for typical component diode.


3.4.5 Noise Discussion 


The decrease of noise voltage with increasing gold doping can be
seen in Fig. 10.* Based on measurements made on units with measurable
noise, the noise is 1/f noise over the audio frequency range, i.e.,

$$\Delta v_n^2 = (\mathrm{const}/f)\,\Delta f$$

or

$$\Delta i_n^2 = (\mathrm{const}/f)\,\Delta f,$$

where $v_n$ = noise voltage, $i_n$ = noise current and $\Delta f$ = small frequency
range.

Fig. 20 — Noise voltage squared as a function of forward bias current, $I_F$, for
two noisy development diodes.


Measurement as in Fig. 20 shows that the dependence of noise voltage
on total dc current, $I_F$, is


= (const) /IF°. 


* Except for the noise voltages plotted in Fig. 10, which are measured as
described in the last section, all noise voltages are equivalent open circuit
voltages. All noise currents are equivalent short circuit currents.

Because of this dependence of noise voltage on forward current, the
noise limit is specified at the low current of 2.5 µA.
Note that because $i_n = v_n/Z_f$ and $Z_f = nkT/qI_F$,


i, = (const)IF*. (12) 


The decrease of noise at a given forward bias current, $I_F$, with increasing
gold doping may be explained in the following manner. As
recombination-generation centers are increased, the forward voltage,
$V_F$, required to attain a given forward current decreases. If there is
a secondary current (much less in magnitude than the recombination- 
generation current), which is the noise generating current, and if the 
noise due to this current increases with increasing forward bias volt- 
age, then adding gold decreases the forward voltage for a specified 
current and the decreased forward voltage results in lower noise. This 
secondary current is quite likely associated with surface, bulk or chan- 
nel leakage components or excess tunneling current derived from anom- 
alous intermediate energy states. 

If the above explanation is correct, the noise current measured at a 
specified forward voltage should be the same for various gold doping 
levels. The noise current is compared for a forward voltage of 0.463 
volts for several groups from Fig. 10. The reason for comparing noise 
currents rather than noise voltages will be made clear shortly. Average 
noise voltages from Fig. 10 were corrected for test set noise and for 
the parallel 17,000-ohm resistor and converted to noise current. 


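$$v_n^2(\text{corrected}) = v_n^2(\text{measured}) - v_n^2(\text{set}), \quad \text{where } v_n(\text{set}) = 2.4\ \mu\text{V}$$

$$v_n = v_n(\text{open circuit}) = v_n(\text{corrected})\,(1.7 \times 10^4 + Z_f)/(1.7 \times 10^4)$$

$$i_n = v_n/Z_f .$$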


The V-I characteristics of each group gave the bias current, $I_F$, for
$V_F$ = 0.463 volts for that group. The empirical equation (12) was used
to find the noise current at this new current, since the constant in (12)
could be found from the measurement at 2.5 µA above. Calculations
were made for gold diffusion temperatures of 1150, 1200, 1225, and
1250°C where the greatest change in noise appeared to take place. The
results, in Table II, are in rather good agreement with the hypothesis
that the noise current depends only on the voltage $V_F$.
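
The correction steps described above (removing test set noise, undoing the 17,000-ohm divider, and converting to a short-circuit current) can be sketched as follows. The input values in the example are hypothetical and are not data from Fig. 10; the function name is likewise an assumption made for illustration.

```python
import math

R_PARALLEL = 1.7e4  # ohms, resistor in parallel with the diode in the test set
V_SET_UV = 2.4      # microvolts, test set noise


def noise_current_na(v_measured_uv, z_f_ohms):
    """Convert a measured noise voltage (uV) to a short-circuit noise current (nA)."""
    # Remove the test set noise in quadrature; clamp at zero at the noise floor.
    v_corr = math.sqrt(max(v_measured_uv ** 2 - V_SET_UV ** 2, 0.0))
    # Undo the divider formed by the diode impedance and the parallel resistor.
    v_open = v_corr * (R_PARALLEL + z_f_ohms) / R_PARALLEL
    # Equivalent short-circuit current: uV / ohm = uA, then scale to nA.
    return 1e3 * v_open / z_f_ohms


if __name__ == "__main__":
    # Hypothetical reading: 10 uV measured on a diode with Z_f = 20,000 ohms.
    print(f"i_n = {noise_current_na(10.0, 2.0e4):.3f} nA")
```

Scaling the result to the bias current at the comparison voltage would additionally require the empirical current dependence of equation (12), which is not reproduced in this sketch.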

The fact that the noise must be described as a current generator
(rather than a voltage generator) follows logically from a circuit analysis
of the physical diode. An equivalent circuit for the diode is given
in Fig. 21(a), where $I_d$ is the diode current calculated earlier and $I_x$
is an excess current.

TABLE II — NOISE CURRENT AND DC CURRENT AT $V_F$ = 0.463 VOLTS.

Gold diffusion temperature    1150°C     1200°C     1225°C     1250°C
$I_F$                         3.75 µA    2.5 µA     6.0 µA     21 µA
$i_n$ (short circuit)         1.05 nA    0.95 nA    1.038 nA   1.36 nA

An equivalent ac circuit including noise sources is given in Fig. 21(b),
where x refers to excess current quantities and d to the dominant diode
current quantities. If it is assumed that $I_x \ll I_d$, $Z_x \gg Z_d$ and $i_{nx} \gg i_{nd}$,
the circuit of Fig. 21(c) results. The Thevenin equivalent of Fig. 21(c)
is shown in Fig. 21(d). The noise voltage, $v_n$, is dependent on quantities
related to two independent currents. The noise voltage measured at a
specified voltage changes with gold doping because $Z_f$ (which equals
$nkT/qI_F$) changes with gold doping, while $i_{nx}$ remains constant. Thus,
the noise current is directly related to the noise mechanism, while the
noise voltage is indirectly related.

Fig. 21 — Equivalent circuits of diode with noise sources.

As previously mentioned, there are several candidates for the current
which produces the excess noise. It does not appear to be associated
with bulk recombination current because gold doping does not
change it. It could be associated with surface, channel or bulk leakage
currents since 1/f noise has been widely reported for these components.
It could also be the excess current associated with tunneling®® since 1/f 
noise has been reported for this current in germanium.*! While these 
diodes do not exhibit measurable tunnel current, they are near the 
tunnel diode doping levels. 


IV. CONCLUSIONS


A new semiconductor compandor diode has been developed in which 
the critical small signal forward impedance characteristics are con- 
trolled by bulk material properties. The heavy gold doping employed 
in this design forces bulk recombination-generation currents to domi- 
nate over all surface, channel and diffusion currents and results in a 
low-noise device with well-controlled electrical characteristics. Oxide 
passivation and very low resistivity semiconductor material combine 
to produce an extremely stable device capable of being manufactured 
at yields governed almost exclusively by assembly workmanship. These 
devices were initially designed for use in the N2 and N3 Carrier Systems
and have also been incorporated into the 3A Echo Suppressor
System as a variolosser element. 
Noise-Like Structure in the Image of
Diffusely Reflecting Objects in
Coherent Illumination


By L. H. ENLOE 
(Manuscript received April 13, 1967) 


Holographic and other imaging systems utilizing coherent light introduce
a speckled or noise-like pattern in the image of a diffuse object which
severely degrades image quality. It is desirable to understand this effect
quantitatively. Intelligent design in many cases requires knowledge of
the mean-square value, spatial power spectral density, and autocorrelation
function of the noise-like fluctuations. These quantities have been determined
for the image of a uniform diffuse object. Major results are:

(i) The mean-square value of the fluctuation in the image intensity
is equal to the square of the mean intensity.

(ii) One can decrease the relative magnitude of the noise-like fluctuations
at the cost of a corresponding increase in the aperture required of
the optical system (or hologram) over that required to resolve the desired
image in a spatial frequency sense. In a holographic facsimile or TV
system, this calls for a corresponding increase in electrical bandwidth.

(iii) The improvement in (ii) is not possible for direct viewing with
the human eye, since the resolution of a healthy eye is known to be limited
by diffraction at the iris.


I. INTRODUCTION 


Holographic and other imaging systems using coherent light have
been receiving considerable attention lately.1,2,3,4 Most analyses on
this subject assume that the object reflects specularly, or transmits
specularly if the object is a transparency, i.e., the reflectivity or
transmissivity of the object varies smoothly. Most objects, however,
are more nearly diffuse reflectors. When the image of a diffusely reflecting
object is formed it will be covered with a noise or grain-like
structure5,6,7 which is the speckle pattern which one sees when laser
light is used to illuminate an object.

In this paper we investigate the noise-like or speckled nature of
the image of a uniform diffuse surface. It should be emphasized that 
we are interested in the properties of the image in contradistinction 
to the direct backscattered field studied by Goldfisher.2 We show 
that the intensity consists of two parts. The first is the mean or en- 
semble average intensity and is proportional to the intensity which 
would be obtained if incoherent light were used for illumination. This 
is the desired component of the image and might be likened to a 
signal. The second part of the image is the speckled or noise-like 
component which tends to obscure the average intensity. This noise- 
like component occurs because of the random phase angles associated 
with the scattering centers comprising the microstructure of the dif- 
fuse surface. The spatial autocorrelation function and power spectral 
density of the speckle pattern in the image are found, and are shown 
to be dependent upon the size of the aperture stop. It is shown that 
the variance of the intensity fluctuation is equal to the square of the 
mean intensity. The fluctuation may be reduced, however, if one is 
willing to sacrifice resolution by recording the image on film whose 
resolution is much poorer than that set by the aperture of the optics. 
Unfortunately, this alternative is not available when viewing with the 
human eye, since the resolution of a healthy eye is known to be de- 
termined by the diffraction limit of the iris. This seems to place 
definite limitations upon the use of coherent light in visual systems. 


II. ARBITRARY APERTURE 


The model which we shall use for a diffuse object is shown in 
Fig. 1. Although the object is shown to be a granular transparency, 
it could equally well have been shown as a reflector without loss of 
generality. The essential point is that a monochromatic coherent light 
wave of unit intensity is assumed to be scattered by a random set of 
point scatterers. Each scatterer is assumed to be a unit scatterer which 
is many wavelengths in depth from its neighbor. The relative phase 
of the wave scattered from each scatterer may be assumed to be a 
random variable which is statistically independent of the phase of the 
waves scattered from other scatterers. Any phase change between 0
and 2π is equally likely. Multiple scattering will be neglected.
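
The statistical consequence of this model can be previewed with a small Monte Carlo experiment. This is a sketch under a simplifying assumption that is not part of the analysis below: every scatterer is taken to contribute with equal weight at the observation point, so the field there is just a sum of unit phasors with independent uniform phases. The function names and the choice of 40 scatterers are illustrative.

```python
import cmath
import math
import random


def speckle_intensities(n_scatterers, n_trials, seed=0):
    """Intensity |sum of unit phasors|^2 for n_trials independent realizations."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_trials):
        # Each scatterer adds a unit-amplitude wave with a random phase in (0, 2*pi).
        field = sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                    for _ in range(n_scatterers))
        out.append(abs(field) ** 2)
    return out


def mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    m = sum(values) / len(values)
    var = sum((x - m) ** 2 for x in values) / len(values)
    return m, var


if __name__ == "__main__":
    mean, var = mean_and_variance(speckle_intensities(40, 4000))
    print(f"mean = {mean:.1f}, variance / mean^2 = {var / mean ** 2:.2f}")
```

For many overlapping scatterers the ratio of variance to squared mean comes out close to one, which is the behavior the paper goes on to derive for the image intensity.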

The scattered field just to the right of the granular transparency
can be expressed by the equation

$$F(x, y) = \sum_{i=1}^{K} \delta(x - x_i,\ y - y_i)\, e^{j\theta_i}, \tag{1}$$

where $\theta_i$ is the relative phase of the wave scattered from the scatterer
located at $x = x_i$, $y = y_i$. $\theta_i$, $x_i$ and $y_i$ are assumed to be random variables
uniformly distributed in the intervals $(0, 2\pi)$, $(-X, +X)$ and
$(-Y, +Y)$, respectively. Notice that because of our assumptions, the
statistics of the scattered field are independent of any deterministic
variation in the phase of the illuminating field.

Fig. 1 — A uniform wave of coherent light is incident on a transparency composed
of randomly distributed unit point scatterers. Light collected by the
aperture A, placed in the far-field, is imaged by lens Z on plane P.

A Fourier transform relationship exists between the scattered field
given by (1) and its far-field. The far-field is given by

$$F_d(\xi, \eta) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} F(x, y)\, e^{j(2\pi/\lambda d)(x\xi + y\eta)}\, dx\, dy = \sum_{i=1}^{K} e^{j[\theta_i + (2\pi/\lambda d)(x_i \xi + y_i \eta)]}, \tag{2}$$

where we have suppressed the time factor $e^{j\omega t}$. Notice that each scatterer
has produced a plane wave, and that the slope of the phase front
of each wave with respect to the $\xi$, $\eta$ axes is determined by the position
$(x_i, y_i)$ of the random scatterer.

Let the far-field $F_d(\xi, \eta)$ be passed through an aperture having an
amplitude transmission function $H(\xi, \eta)$, and then through a lens which
is placed a distance z behind the aperture. Since the field at the back
focal plane of a lens is a Fourier transform like function of the field in
front of the lens, an image of the granular transparency, as modified
by the aperture, will be formed in the back focal plane, and is given by

$$F(v, w) = e^{jc(v^2 + w^2)} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} H(\xi, \eta)\, F_d(\xi, \eta)\, e^{j(2\pi/\lambda f)(v\xi + w\eta)}\, d\xi\, d\eta = e^{jc(v^2 + w^2)} \sum_{i=1}^{K} h\!\left(\frac{v}{\lambda f} + \frac{x_i}{\lambda d},\ \frac{w}{\lambda f} + \frac{y_i}{\lambda d}\right) e^{j\theta_i}, \tag{3a}$$

where $c = \pi(z - f)/\lambda f^2$, and where $h(t, u)$ and $H(\xi, \eta)$ are Fourier
transform pairs in the sense

$$h(t, u) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} H(\xi, \eta)\, e^{j2\pi(t\xi + u\eta)}\, d\xi\, d\eta. \tag{3b}$$

Notice that except for the unimportant phase factor $e^{jc(v^2 + w^2)}$, (3a)
differs from (1) for the field at the granular transparency only in that
h( ) functions have replaced the delta functions. That is to say, the
delta function of light from the scatterer at $(x_i, y_i)$ is reproduced as a
broadened h( ) function located at $v = -(f/d)x_i$, $w = -(f/d)y_i$.
The image is reversed, and magnified by the factor $m = f/d$. Notice
that because of the random phase $\theta_i$ of each, the impulse functions will
add vectorially in a random fashion when they overlap one another.
The situation is analogous to passing shot noise impulses through
a low-pass filter having an impulse response h( ). The impulses are
broadened into h( ) pulses whose width depends inversely upon the
filter bandwidth. In the coherent light case, however, the process is
two dimensional and the applied impulses have random phase angles
distributed uniformly between 0 and 2π, rather than being constrained
to be positive impulse functions as is the case for shot noise.
The quantity of greatest interest to us is the intensity of the image,
which is found by multiplying the image field by its conjugate.

$$I(v, w) = F(v, w)\, F^*(v, w) = \sum_{i=1}^{K} \sum_{k=1}^{K} h\!\left(\frac{v}{\lambda f} + \frac{x_i}{\lambda d},\ \frac{w}{\lambda f} + \frac{y_i}{\lambda d}\right) h^*\!\left(\frac{v}{\lambda f} + \frac{x_k}{\lambda d},\ \frac{w}{\lambda f} + \frac{y_k}{\lambda d}\right) e^{j(\theta_i - \theta_k)}. \tag{4}$$

The uniform diffuse object is assumed to exist in the region $-X \le x \le +X$,
$-Y \le y \le +Y$. The number K of point scatterers in this
region is a random variable, as are their positions $(x_i, y_i)$ and their
relative phase angles $\theta_i$. We may, therefore, obtain the ensemble
average of the image intensity I by averaging (4) with respect to the
2K + 1 random variables consisting of the K positions $(x_i, y_i)$, K
phase angles $\theta_i$, and K itself:

$$\bar{I} = \int \cdots \int I(v, w)\, W(x_1, y_1, x_2, y_2, \ldots, x_K, y_K, \theta_1, \ldots, \theta_K, K)\, dx_1\, dy_1 \cdots dx_K\, dy_K\, d\theta_1 \cdots d\theta_K\, dK, \tag{5}$$

where W( ) is the multi-dimensional probability density function.

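Now the positions $(x_i, y_i)$ are considered to be statistically independent
variables, as are the relative phase angles $\theta_i$. They are also
independent of K, so we may simplify (5) to obtain

$$\bar{I} = \int_0^{\infty} W(K)\, dK \sum_{k=1}^{K} \sum_{i=1}^{K} \int_{-X}^{+X} \frac{dx_1}{2X} \cdots \int_{-X}^{+X} \frac{dx_K}{2X} \int_{-Y}^{+Y} \frac{dy_1}{2Y} \cdots \int_{-Y}^{+Y} \frac{dy_K}{2Y} \int_0^{2\pi} \frac{d\theta_1}{2\pi} \cdots \int_0^{2\pi} \frac{d\theta_K}{2\pi}\ h\!\left(\frac{v}{\lambda f} + \frac{x_i}{\lambda d},\ \frac{w}{\lambda f} + \frac{y_i}{\lambda d}\right) h^*\!\left(\frac{v}{\lambda f} + \frac{x_k}{\lambda d},\ \frac{w}{\lambda f} + \frac{y_k}{\lambda d}\right) e^{j(\theta_i - \theta_k)}. \tag{6}$$

We see that the above expression vanishes unless $\theta_i = \theta_k$, i.e., $i = k$.
Further, all of the h( ) functions have the same shape so that if the
size of a resolution element in the image is small compared to the
field of view, i.e., the extent of $h(v/\lambda f, w/\lambda f)$ is small compared to X
and Y, then we may replace the limits of integration ±X and ±Y by
±∞ to obtain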


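$$\bar{I} = \frac{\lambda^2 d^2 \rho_1(0, 0)}{4XY} \int_0^{\infty} K\, W(K)\, dK. \tag{7}$$

$\rho_1(u, v)$ is the autocorrelation function of the aperture impulse
function h( ), i.e.,

$$\rho_1(u, v) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} h^*(t, \tau)\, h(t + u,\ \tau + v)\, dt\, d\tau = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} H(\xi, \eta)\, H^*(\xi, \eta)\, e^{j2\pi(u\xi + v\eta)}\, d\xi\, d\eta. \tag{8}$$

If we now assume that the number of scatterers per unit area of the
transparency has a Poisson distribution of mean N, then the mean
intensity is

$$\bar{I} = \lambda^2 d^2 N \rho_1(0, 0). \tag{9}$$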
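
The step from (7) to (9) uses only the mean of the Poisson distribution. As a brief sketch, assuming the scatterers cover the object area 4XY with mean density N, so that the mean number of scatterers is 4XYN:

```latex
\int_0^{\infty} K\, W(K)\, dK = \overline{K} = 4XY\,N
\quad\Longrightarrow\quad
\bar{I} = \frac{\lambda^2 d^2 \rho_1(0,0)}{4XY}\,\bigl(4XY\,N\bigr)
        = \lambda^2 d^2 N\, \rho_1(0,0).
```

The object dimensions X and Y cancel, so the mean image intensity depends on the scatterer density rather than on the total number of scatterers.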
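Next we wish to determine an expression for the autocorrelation
function of the intensity, from which we may determine the spatial
power spectral density and variance of the noise-like fluctuations. The
autocorrelation function of the intensity as given by (4) is

$$R(r, t) = \overline{I(v, w)\, I(v + r,\ w + t)} = \overline{\sum_{i=1}^{K} \sum_{k=1}^{K} \sum_{m=1}^{K} \sum_{n=1}^{K} h\!\left(\tfrac{v}{\lambda f} + \tfrac{x_i}{\lambda d},\ \tfrac{w}{\lambda f} + \tfrac{y_i}{\lambda d}\right) h^*\!\left(\tfrac{v}{\lambda f} + \tfrac{x_k}{\lambda d},\ \tfrac{w}{\lambda f} + \tfrac{y_k}{\lambda d}\right) h^*\!\left(\tfrac{v + r}{\lambda f} + \tfrac{x_m}{\lambda d},\ \tfrac{w + t}{\lambda f} + \tfrac{y_m}{\lambda d}\right) h\!\left(\tfrac{v + r}{\lambda f} + \tfrac{x_n}{\lambda d},\ \tfrac{w + t}{\lambda f} + \tfrac{y_n}{\lambda d}\right) e^{j(\theta_i - \theta_k - \theta_m + \theta_n)}}. \tag{10}$$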
Because of the statistical independence of the phase angles $\theta_i$, positions
$(x_i, y_i)$ and K, and because of the assumed uniform distribution,
we may simplify (10) to


R(r, t) = [ W(K) dK 
K K K K +00 +00 +0 
dx, fo" dy |, dix f°" dx 
DipI 2p ee oY ogo OX J OY 
We see that the integral vanishes unless

i = k and n = m

or

n = k and i = m,

which gives


R(r, ) = a . W(K) dK 


K 
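Now, we have two subcases here. There are K(K − 1) terms for which
k ≠ m, and there are K terms for which k = m.

$$R(r, t) = \int_0^{\infty} K(K - 1)\, W(K)\, dK \left\{ \left[ \int\!\!\int \left| h\!\left(\tfrac{v}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w}{\lambda f} + \tfrac{y}{\lambda d}\right) \right|^2 \frac{dx\, dy}{4XY} \right] \left[ \int\!\!\int \left| h\!\left(\tfrac{v + r}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w + t}{\lambda f} + \tfrac{y}{\lambda d}\right) \right|^2 \frac{dx\, dy}{4XY} \right] + \left| \int\!\!\int h\!\left(\tfrac{v}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w}{\lambda f} + \tfrac{y}{\lambda d}\right) h^*\!\left(\tfrac{v + r}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w + t}{\lambda f} + \tfrac{y}{\lambda d}\right) \frac{dx\, dy}{4XY} \right|^2 \right\}$$

$$\qquad + \int_0^{\infty} K\, W(K)\, dK \int\!\!\int \left| h\!\left(\tfrac{v}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w}{\lambda f} + \tfrac{y}{\lambda d}\right) \right|^2 \left| h\!\left(\tfrac{v + r}{\lambda f} + \tfrac{x}{\lambda d},\ \tfrac{w + t}{\lambda f} + \tfrac{y}{\lambda d}\right) \right|^2 \frac{dx\, dy}{4XY}. \tag{13}$$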


Assuming that the distribution of scatterers W( ) is Poisson and
using the definition of h( ) given in (3b), straightforward evaluation
of the integrals in (13) yields

$$R(r, t) = \bar{I}^2 \left[ 1 + \frac{|\rho_1(r/f\lambda,\ t/f\lambda)|^2}{\rho_1^2(0, 0)} \right] + \frac{\bar{I}}{\rho_1(0, 0)}\, \rho_2(r/f\lambda,\ t/f\lambda), \tag{14}$$

where $\rho_1(u, v)$ is defined in (8) and

$$\rho_2(u, v) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} |h(r, t)|^2\, |h(r + u,\ t + v)|^2\, dr\, dt$$

= autocorrelation function of the magnitude squared of
the aperture impulse function.

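The spatial power spectral density is found by taking the Fourier
transform of (14). After simplification we obtain

$$S(q, p) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} R(r, t)\, e^{-j2\pi(qr + pt)}\, dr\, dt = \bar{I}^2 \left\{ \delta(q, p) + \frac{(\lambda f)^2}{\rho_1^2(0, 0)}\, |H(\lambda f q, \lambda f p)|^2 \circledast |H(\lambda f q, \lambda f p)|^2 + \frac{f^2}{d^2 N \rho_1^2(0, 0)}\, \bigl| H(\lambda f q, \lambda f p) \circledast H(\lambda f q, \lambda f p) \bigr|^2 \right\}, \tag{15}$$

where $\circledast$ stands for convolution. In particular we define

$$F(\lambda f q, \lambda f p) \circledast G(\lambda f q, \lambda f p) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} F^*(x, y)\, G(\lambda f q + x,\ \lambda f p + y)\, dy\, dx. \tag{16}$$

Equation (9), which gives the mean intensity of the image, and
(15), which gives the power spectral density of the intensity fluctuations,
are the major results of this section.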


III. CIRCULAR APERTURE


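Now consider the special case of a circular aperture of radius $r_c$,
and let it be located on axis so that

$$ H(\xi, \eta) = \begin{cases} 1, & r \le r_c \\ 0, & r > r_c, \end{cases} \tag{17} $$

where

$$ r = +\sqrt{\xi^2 + \eta^2}. $$

The average intensity in the image plane is given by (9) and is

$$ I = d^2 \lambda^2 N \rho_1(0, 0) = \pi N (\lambda d r_c)^2, \tag{18} $$

where $\rho_1(0, 0)$ was evaluated from the integral

$$ \rho_1(0, 0) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |H(\xi, \eta)|^2 \, d\xi \, d\eta = \pi r_c^2. \tag{19} $$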


Evaluation of the integrals in (15) gives the power spectral density 


S(q, p) = | a0, p) + = " — sin” (32) te (se) ~ (2) 
+ p(t Jen (GE) -2 Ei -(E))}} 


where

    $q, p$ = image plane spatial frequencies in rectangular coordinates,

    $s = +\sqrt{q^2 + p^2}$,

    $s_c = r_c/f\lambda$ = cutoff frequency produced by diffraction at the
    circular aperture,

    $F = \lambda^2 d^2 N / 2\pi r_c^2$ = overlap factor.

The overlap factor F warrants some discussion. Basically it is equal
to the average number of point scatterer image centers contained 
within an area equal to that occupied by the image of a single 
scatterer. That is, a single point scatterer located at (0,0) in the 
object plane would produce a point in the image plane at (0,0) 


having intensity 
Pie ws 4 
ea ale ; “) 


g(t VF a] 
a fess | 


ete eo 
pr Vy + a 


2 











The intensity is down approximately 50 percent at
$(2\pi r_c/f\lambda)\sqrt{v^2 + w^2} = \sqrt{2}$, and the area covered by
the image of the point scatterer at this 50 percent value is
$A = \pi(v^2 + w^2) = f^2\lambda^2/2\pi r_c^2$. For a diffuse object, the
average number of imaged scattering centers per unit area in the image
plane is $\bar{n} = (d/f)^2 N$. If we define the overlap factor F as the
average number of scatterer image centers falling in the area of one of
these images we have

$$ F = \bar{n} A = \frac{\lambda^2 d^2 N}{2\pi r_c^2}. $$

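For a truly diffuse surface, the overlap factor $F \gg 1$ so that (20)
reduces to

$$ S(q, p) = I^2 \left[ \delta(q, p) + \frac{1}{\pi s_c^2} \left\{ 1 - \frac{2}{\pi} \sin^{-1}\!\left(\frac{s}{2s_c}\right) - \frac{2}{\pi} \left(\frac{s}{2s_c}\right) \sqrt{1 - \left(\frac{s}{2s_c}\right)^2} \right\} \right], \tag{21} $$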


which is plotted in Fig. 2. Note that it is symmetrical about the 
vertical axis. For very small spatial frequencies, (21) can be
approximated by

$$ S(q, p) \approx I^2 \left[ \delta(q, p) + \frac{1}{\pi s_c^2} \right]. \tag{22} $$
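
As a numerical cross-check on (21) and (22), the following minimal sketch integrates the continuous part of the spectrum over the frequency plane. It assumes that the continuous part, normalized by $I^2$, has the form $(1/\pi s_c^2)\{1 - (2/\pi)\sin^{-1}(s/2s_c) - (2/\pi)(s/2s_c)\sqrt{1 - (s/2s_c)^2}\}$ for $s \le 2s_c$ and vanishes beyond. The function names, the unit cutoff, and the midpoint rule are illustrative choices and not part of the original analysis.

```python
import math


def continuous_spectrum(s, s_c):
    """Continuous part of S(q, p) / I^2 at radial frequency s (assumed form of (21))."""
    x = s / (2.0 * s_c)
    if x >= 1.0:
        return 0.0  # assumed: no power beyond twice the diffraction cutoff
    shape = 1.0 - (2.0 / math.pi) * (math.asin(x) + x * math.sqrt(1.0 - x * x))
    return shape / (math.pi * s_c * s_c)


def noise_power(s_max, s_c, steps=20000):
    """Integrate the continuous part over the disc s <= s_max (midpoint rule, polar form)."""
    ds = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += 2.0 * math.pi * s * continuous_spectrum(s, s_c) * ds
    return total


s_c = 1.0  # hypothetical cutoff frequency in arbitrary units
total_variance = noise_power(2.0 * s_c, s_c)  # variance divided by I^2
low_freq_level = continuous_spectrum(0.0, s_c) * math.pi * s_c ** 2
```

Under these assumptions `total_variance` comes out close to one, so the variance of the fluctuations equals the square of the mean intensity, and `low_freq_level` equals one, which is the flat level $1/\pi s_c^2$ appearing in (22).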


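The total fluctuation or noise power occurring in spatial frequencies
less than some frequency $s_f \ll s_c$ is, on integrating the continuous
part of (22) over the disc of radius $s_f$,

$$ I^2 \, \frac{\pi s_f^2}{\pi s_c^2} = I^2 \left(\frac{s_f}{s_c}\right)^2. \tag{23} $$


IV. CONCLUSIONS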


We have found that the image of a uniform diffuse object illumi- 
nated with monochromatic coherent light consists of two parts. The 
first is the mean or ensemble average given by (18), for a circular 
aperture, and is proportional to the intensity which would be obtained 
if noncoherent light were used for illumination. This is the desired 
component and might be likened to the signal component of the 
image. The square of this term appears as the first term in (20), 
(21), and (22), and as the delta function in Fig. 2. The second part 
of the image is a grainy or noise-like component which tends to 
obscure the mean intensity or signal. This noise-like component 
occurs because of the random phase angles associated with the point 
scatterers comprising the microstructure of the diffuse object. This 
component is shown as the second term in (20), (21), and (22), and 
as the continuous part of the power spectrum in Fig. 2. Integration of 
(21) shows that the variance of the noise-like fluctuations in the 
intensity is equal to the square of the mean intensity (or to the signal 
power). This is fortunate to the extent that when the signal is small, 
the noise is likewise small. However, while our analysis was for the 
particular case of a uniform diffuse surface, we can safely predict 
that for nonuniform diffuse objects fine detail in the image will be 
largely obscured by the noise-like fluctuations if resolution is limited 
by diffraction. 

The noise-like fluctuations in the image can be reduced if one 
records the image on film whose modulation transfer function has a 
bandwidth which is much smaller than the diffraction limit of the 
optical system. The high-frequency noise in Fig. 2 will not be resolved 
in this case. For instance, if one requires the “signal-to-noise” ratio
to be increased from unity to $10^3$ (30 dB), then from (23) we see that
the diffraction bandwidth $s_c$ of the improved optical system must be
$10^{3/2} = 31.6$ times the bandwidth $s_f$ resolvable by the film, and
therefore by the whole system. (Since most transducers produce a signal
which is proportional to the intensity of the incident light, it seems
appropriate to consider the square of the mean intensity as signal
power and the variance of the intensity fluctuations as noise power.)

Fig. 2—Section of the spatial power spectral density for a uniform diffuse
surface imaged through a circular aperture. The complete two dimensional
spectrum is obtained by rotating the above curve about the vertical axis.
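
The factor of 31.6 can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: it takes the continuous spectrum to be the circular-aperture shape of (21), measures the film bandwidth in units of the diffraction cutoff, and compares the flat low-frequency approximation with a direct numerical integration. The function names are hypothetical.

```python
import math


def noise_fraction(t, steps=4000):
    """Fraction of the fluctuation power below radial frequency t * s_c (0 <= t <= 2)."""
    dt = t / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * dt
        x = 0.5 * tau
        shape = 1.0 - (2.0 / math.pi) * (math.asin(x) + x * math.sqrt(1.0 - x * x))
        total += 2.0 * tau * shape * dt
    return total


def bandwidth_ratio(snr, exact=False):
    """Ratio of diffraction bandwidth to film bandwidth needed for a given power SNR."""
    if not exact:
        return math.sqrt(snr)  # flat-spectrum approximation: fraction = t**2
    lo, hi = 0.0, 2.0
    for _ in range(100):  # bisection on the monotone noise fraction
        mid = 0.5 * (lo + hi)
        if noise_fraction(mid) < 1.0 / snr:
            lo = mid
        else:
            hi = mid
    return 1.0 / (0.5 * (lo + hi))


approx_ratio = bandwidth_ratio(1000.0)
exact_ratio = bandwidth_ratio(1000.0, exact=True)
```

With a target of $10^3$ the flat approximation gives $10^{3/2}$, and the integrated spectrum gives a slightly smaller ratio because the spectrum falls off away from zero frequency.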

Although we have analyzed the very special optical system shown 
in Fig. 1, our results are not critically dependent upon the placement 
of the aperture. The aperture could be the lens aperture rather than 
an independent physical device, or it could be the aperture defined 
by the finite size of a hologram, for instance. Our results should also 
hold approximately for the human eye, since the resolution of a 
healthy eye is known to be determined by the diffraction limit of the 
iris. The predicted value of unity for the signal-to-noise ratio is the 
right order of magnitude for what one observes when laser light is 
used for illumination if one is careful to hold the eye stationary and 
hence not average the noise out as a function of time. Although 
moving the eye tends to average out the noise, the residual noisiness 
remains objectionable. This places definite limitations upon the use 
of coherent light in visual systems. 


The author wishes to thank Messrs. C. B. Rubinstein and A. B.
Larsen for helpful discussions.


The Excitation of Planar Dielectric
Waveguides at p-n Junctions, I


By J. McKENNA 
(Manuscript received April 26, 1967) 


The fields excited within a planar dielectric waveguide by an externally 
incident electromagnetic field are studied in this paper. The dielectric 
waveguide fills the half space z > 0, while the half space z < 0 is air.
The waveguide is formed by a nonuniform, anisotropic, nonabsorbing, 
dielectric medium. Different choices of the dielectric tensor for this medium 
yield different waveguides. Certain models which are particularly relevant 
to electro-optic diode waveguides and laser diode amplifiers are studied 
in some detail. An arbitrary incident field will, in general, excite not only
a finite number of propagating modes, but also a background of continuum 
modes. Integral representations of the total transmitted field within the 
waveguide as well as of the reflected field are obtained. The representation 
of the total transmitted field can be decomposed into a finite sum of discrete 
propagating modes, a continuum propagating field, and an evanescent 
field. Explicit evaluation of the fields depends on the solution of a pair
of integral equations. In practice, the dielectric tensor of the waveguide 
differs but little from the dielectric constant of the surrounding material. 
An approximate solution is found for this case, and numerical results 
will appear in a following paper. 


I. INTRODUCTION 


Recently there has been great interest in the guiding of light by the 
p-n junction region in certain piezoelectric semiconductors, for it has 
been noted that the Pockels effect due to the electric field within the 
p-n junction can be used to modulate light which propagates parallel 
to the junction plane.++ This effect was first observed, and has been 
most intensively studied, with visible light in GaP junctions,’ but it 
has also been observed with infrared light in GaAs junctions." * 

All treatments of the effect so far have assumed that the p-n junction
region, which has a higher dielectric constant than the surrounding,
normal GaP, behaves like a dielectric waveguide.t® A detailed
analysis of this waveguide would require a knowledge of the optical 
properties in the neighborhood of the junction. However, since these 
properties change significantly in a fraction of a wavelength, it is ex- 
tremely difficult to investigate them individually by experimental 
means. In order to get around this difficulty it has been necessary to 
adopt an indirect approach based on analyzing a number of different 
mathematical models and comparing their predictions with experiment. 

As part of this program Nelson and McKenna‘ have investigated 
the possible discrete modes which can propagate in a number of dif- 
ferent models and have studied in considerable detail the properties of 
the lowest-order mode of each polarization. Recent experimental work 
has made it increasingly clear, however, that a knowledge of the dis- 
crete modes alone is not enough to provide an understanding of these 
p-n junction dielectric waveguides. This is because a beam of light, 
when focused on the face of a junction waveguide, excites within the 
waveguide not only a finite number of discrete modes, but also a back- 
ground of continuum modes. In many cases this background light is 
intense enough to mask important features of the discrete propagat- 
ing modes. Thus, unless an understanding of this background light is 
available, the task of comparing the predictions of different mathe- 
matical models with experiment is almost impossible. An understand- 
ing of the electromagnetic boundary value problem involved also has 
great relevance to understanding what happens when light is intro- 
duced into a laser diode amplifier. 

The purpose of this paper is to study in some detail a class of math- 
ematical models of the excitation of dielectric waveguides. These mod- 
els are simple enough so that the mathematical analysis can be per- 
formed and the background light can be investigated carefully. At the 
same time, it is felt that the models are realistic enough so that their 
predictions can be compared with experiment. 

The models can be described as follows. The waveguide consists of 
the half space z > 0, as shown in Fig. 1, while the region z < 0 is air. 
The waveguide itself is assumed to be formed by a nonuniform, aniso- 
tropic, nonabsorbing dielectric. The components of the dielectric ten- 
sor are functions of the coordinate x only, and for each value of x the 
dielectric tensor is diagonal in the fixed coordinate system shown in 
Fig. 1. As an example, for the GaP electro-optic diode modulator stud- 
ied in NM this corresponds to the cases where the junction field is in 
the [111] or [100] directions. Each such model is determined by its
dielectric tensor whose diagonal elements we will denote by $K_n(x)$
$(n = x, y, z)$.

Fig. 1—Symmetric step model illustrating the coordinate system used in all
the models. The dielectric tensor is always diagonal in this fixed
coordinate system.

It was shown in NM that the amount of absorption encountered in
GaP electro-optic diode modulators was too small to affect significantly
the shape of the modes. It is, therefore, felt that the study of
absorptionless models here is well justified. It was also shown in NM that
the detailed analytical form of the functions $K_n(x)$ is not important
when only the lowest-order discrete mode of each polarization can
propagate. The most important features of the discrete modes can be
determined by studying models for which the functions $K_n(x)$ are
step functions (piece-wise constant). Although it is possible to carry
out a good deal of the analysis without specifying the functions $K_n(x)$,
the final detailed results naturally depend on the choice of $K_n(x)$. We
shall concentrate here on two models, the symmetric step model and
the asymmetric step model. The symmetric step model is defined by
the equations⁶

$$ K_m(x) = K_m, \qquad |x| < w \tag{1} $$
$$ \phantom{K_m(x)} = K_0, \qquad |x| > w \tag{2} $$

and the asymmetric step model is defined by the equations⁶

$$ K_m(x) = K_m, \qquad |x| < w \tag{3} $$
$$ \phantom{K_m(x)} = K_1, \qquad x < -w \tag{4} $$
$$ \phantom{K_m(x)} = K_2, \qquad x > w, \tag{5} $$

where $K_0 < K_m$, and $K_m > K_j \ge 1$, $m = x, y, z$, $j = 0, 1, 2$ (see
Fig. 2). In the case of the GaP electro-optic diode modulators there are
relations of the form⁶

$$ K_m = n^2(1 + \delta_m), \qquad (m = x, y, z) \tag{6} $$
$$ K_0 = n^2(1 - \Delta), \qquad K_j = n^2(1 - \Delta_j), \qquad (j = 1, 2). \tag{7} $$

In (6) and (7) $n$ is the index of refraction of normal GaP, the
quantities $\delta_m$ are linear in the junction field (the linear
electro-optic effect), and $0 \le |\delta_m| < \Delta \ll 1$.
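
The two step models can be written as small profile functions, which is sometimes convenient when experimenting with different choices of the dielectric constants. The sketch below is a hedged illustration: it reads (1)-(7) as an inner constant for |x| < w and outer constants elsewhere, it assigns the boundary points |x| = w to the outer region (the source leaves them unspecified), and the numerical values of n, delta_m, and Delta are hypothetical placeholders rather than data from the paper.

```python
def symmetric_step(x, w, k_inside, k_0):
    """Symmetric step profile: k_inside for |x| < w, k_0 outside (reading of (1)-(2))."""
    return k_inside if abs(x) < w else k_0


def asymmetric_step(x, w, k_inside, k_1, k_2):
    """Asymmetric step profile: k_inside for |x| < w, k_1 on the left, k_2 on the right."""
    if abs(x) < w:
        return k_inside
    return k_1 if x < 0 else k_2


def modulator_constants(n, delta_m, big_delta):
    """Inner and outer constants from relations of the form (6)-(7), as read here."""
    if not (0.0 <= abs(delta_m) < big_delta < 1.0):
        raise ValueError("expected 0 <= |delta_m| < Delta << 1")
    return n * n * (1.0 + delta_m), n * n * (1.0 - big_delta)


# Hypothetical illustrative values only.
k_in, k_out = modulator_constants(n=3.3, delta_m=1.0e-4, big_delta=1.0e-3)
profile = [symmetric_step(x, 1.0, k_in, k_out) for x in (-2.0, 0.0, 2.0)]
```

The inner constant exceeds the outer one whenever $|\delta_m| < \Delta$, which is the guiding condition $K_0 < K_m$ stated above.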

In Section II we will write down general integral representations for 
incident waves in the region z < 0, as well as integral representations 
for the resulting reflected and transmitted fields. These integral rep- 
resentations will involve a number of unknown functions. Some of 
these functions are determined directly from the structure of the wave- 
guide and are independent of the incident field and the boundary con- 
dition at z = 0. The remaining unknown functions are determined by 
the incident field and the boundary conditions at z = 0. We show that 
these functions satisfy a set of linear integral equations. The results
of Section II are independent of the specific form of $K_n(x)$ and the
incident field. In Section III we explicitly calculate the unknown
functions which depend only on the structure of the waveguide for the
symmetric and asymmetric step models. In Section IV we obtain
approximate solutions of the integral equations for a special class of
waveguide models. The remaining unknown functions are determined
for these models in terms of the incident field. In a second paper on
this subject we will give asymptotic expansions and numerical results
for the fields within the waveguide for the symmetric and asymmetric
models when they are excited by a Gaussian incident wave.

Fig. 2—(a) The function $K_m(x)$ for the symmetric step model. (b) The
function $K_m(x)$ for the asymmetric step model.


II. A GENERAL DESCRIPTION OF THE FIELDS 


In this section we study formal solutions of Maxwell’s equations 
which describe an incident wave in the region z < 0 moving to the 
right and striking the waveguide from the left, a reflected wave in the 
region z < 0, and a transmitted wave in the region z > 0. The fields 
are assumed to be monochromatic and independent of the coordinate 
y. We write for the total electric and magnetic field vectors 


$$ \mathbf{E}(x, z, t) = \mathrm{Re}\,(\mathbf{e}(x, z)e^{i\omega t}), \qquad \mathbf{H}(x, z, t) = \mathrm{Re}\,(\mathbf{h}(x, z)e^{i\omega t}), \tag{8} $$

and for the total electric displacement and magnetic induction vectors

$$ \mathbf{D}(x, z, t) = \mathrm{Re}\,(\mathbf{d}(x, z)e^{i\omega t}), \qquad \mathbf{B}(x, z, t) = \mathrm{Re}\,(\mathbf{b}(x, z)e^{i\omega t}), \tag{9} $$

where Re denotes the real part and $\omega = 2\pi f$ is the angular
frequency of the radiation. Then Maxwell's equations are

$$ \nabla \times \mathbf{e} = -i\omega\mathbf{b}, \qquad \nabla \cdot \mathbf{d} = 0, $$
$$ \nabla \times \mathbf{h} = i\omega\mathbf{d}, \qquad \nabla \cdot \mathbf{b} = 0. \tag{10} $$

From our assumptions about the model, the constitutive equations can
be written as

$$ \mathbf{b} = \mu_0 \mathbf{h}, \qquad \mathbf{d} = \epsilon_0 \mathbf{K} \cdot \mathbf{e}, \tag{11} $$

where $\epsilon_0$ and $\mu_0$ are, respectively, the permittivity and
permeability of free space. The dielectric matrix $\mathbf{K} = \mathbf{K}(x, z)$
is the unit matrix for z < 0, and for z > 0 it is a diagonal matrix whose
diagonal elements, $K_n(x)$ $(n = x, y, z)$, are functions of x only. It is
a straightforward matter to show that any solution of Maxwell's equations
satisfying the above assumptions can be written as the linear combination
of a TE solution and a TM solution. We consider these solutions separately.


2.1 TE Fields 
We first look for TE solutions having the form 


$$ \mathbf{e}(x, z) = [0, e_y(x, z), 0], \qquad \mathbf{h}(x, z) = [h_x(x, z), 0, h_z(x, z)]. \tag{12} $$

In the region z < 0, $e_y$ must satisfy the Helmholtz equation

$$ \frac{\partial^2 e_y}{\partial x^2} + \frac{\partial^2 e_y}{\partial z^2} + k^2 e_y = 0, \tag{13} $$

where the free-space wavenumber k is defined by

$$ k = \omega(\epsilon_0 \mu_0)^{1/2} = 2\pi/\lambda $$

and $\lambda$ is the free-space wavelength. The total field in z < 0 is the
sum of the incident field $e_y^{(i)}$ and the reflected field $e_y^{(r)}$,
and both $e_y^{(i)}$ and $e_y^{(r)}$ are solutions of (13). In the region
z > 0 there is only the transmitted field which satisfies the equation

$$ \frac{\partial^2 e_y}{\partial x^2} + \frac{\partial^2 e_y}{\partial z^2} + k^2 K_y(x) e_y = 0. \tag{14} $$

A solution of (13), which can be found by separation of variables,
and which describes a general incident field due to sources in z < 0
at a finite distance from the plane z = 0, is

$$ e_y^{(i)}(x, z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathcal{E}^{(i)}(l) \exp\{-i\Omega(l)z - ilx\} \, dl, \tag{15} $$

where

$$ \Omega(l) = +\sqrt{k^2 - l^2}, \qquad |l| \le k $$
$$ \phantom{\Omega(l)} = -i\sqrt{l^2 - k^2}, \qquad |l| \ge k. \tag{16} $$

The components of the magnetic field vector can be obtained with the
aid of Maxwell's equations by differentiating (15). Let $\Sigma(z_0)$
denote the strip $-\infty < x < \infty$, $0 \le y \le 1$, lying in the plane
$z = z_0$. Then the time averaged power incident on $\Sigma(z)$, $z \le 0$,
is independent of z and is

$$ P_i = -\tfrac{1}{2} \mathrm{Re} \int_{-\infty}^{\infty} e_y^{(i)}(x, z) \, h_x^{(i)}(x, z)^* \, dx = (4\pi\omega\mu_0)^{-1} \int_{-k}^{k} \sqrt{k^2 - l^2} \, |\mathcal{E}^{(i)}(l)|^2 \, dl, \tag{17} $$

where * denotes complex conjugation. We will assume that

$$ \int_{-\infty}^{\infty} |\mathcal{E}^{(i)}(l)|^2 \, dl < \infty \quad \text{and} \quad \int_{-\infty}^{\infty} |\Omega(l)| \, |\mathcal{E}^{(i)}(l)|^2 \, dl < \infty. $$

If (15) is to describe an incident field due to sources at $z = -\infty$,
then it is easy to see that we must have $\mathcal{E}^{(i)}(l) = 0$, $|l| > k$.
Since the incident field must be specified, it will always be assumed
that $\mathcal{E}^{(i)}(l)$ is known.
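
The branch choice for the longitudinal wavenumber is easy to get wrong in numerical work, so a small sketch may help. It assumes the reading of (16) in which $\Omega(l) = +\sqrt{k^2 - l^2}$ for $|l| \le k$ and $\Omega(l) = -i\sqrt{l^2 - k^2}$ for $|l| \ge k$, together with the plane-wave factor $\exp\{-i\Omega(l)z - ilx\}$ of (15). The function names are hypothetical.

```python
import cmath
import math


def omega(l, k):
    """Longitudinal wavenumber with the branch assumed from (16)."""
    if abs(l) <= k:
        return complex(math.sqrt(k * k - l * l), 0.0)  # propagating component
    return complex(0.0, -math.sqrt(l * l - k * k))     # evanescent component


def incident_factor(l, k, x, z):
    """Plane-wave factor exp{-i Omega(l) z - i l x} appearing in the incident field."""
    return cmath.exp(-1j * omega(l, k) * z - 1j * l * x)


k = 2.0  # hypothetical free-space wavenumber, arbitrary units
propagating = abs(incident_factor(1.0, k, x=0.3, z=-1.0))
evanescent_far = abs(incident_factor(4.0, k, x=0.3, z=-1.0))
evanescent_near = abs(incident_factor(4.0, k, x=0.3, z=0.0))
```

With this branch, components with $|l| > k$ decay on the way from the source plane toward z = 0, so `evanescent_far` exceeds `evanescent_near`, while propagating components keep unit modulus. This is consistent with the remark that sources at $z = -\infty$ can contribute only for $|l| \le k$.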
A solution of (13) describing a general wave reflected from the
waveguide surface z = 0 is

$$ e_y^{(r)}(x, z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathcal{E}^{(r)}(l) \exp\{i\Omega(l)z - ilx\} \, dl. \tag{18} $$

We will always assume that the source of the incident radiation is
perfectly absorbing so that $e_y^{(i)}(x, z) + e_y^{(r)}(x, z)$ is the total
field in the region between the source and the surface of the waveguide at
z = 0. It will be seen that because of the boundary conditions at z = 0,
$\mathcal{E}^{(r)}(l)$ generally does not vanish outside some finite $l$
interval. Because of the factor $\exp\{i\Omega(l)z\}$, that part of the
integral in (18) between the limits $-k$ and $k$, $\int_{-k}^{k} \{\ \} \, dl$,
represents a traveling field, while the remainder of the integral
represents an evanescent field which damps out very rapidly in the
negative z direction. The time averaged power reflected back through the
strip $\Sigma(z)$, $z \le 0$, is

$$ P_r = (4\pi\omega\mu_0)^{-1} \int_{-k}^{k} \sqrt{k^2 - l^2} \, |\mathcal{E}^{(r)}(l)|^2 \, dl. \tag{19} $$


We now turn to the transmitted field. We use the method of separa- 
tion of variables, and we seek transmitted waves which are linear 
superpositions of solutions of (14) of the form 


$$ e_y^{(t)}(x, z) \propto e_y(x) \exp\{-i\sqrt{-\nu}\, z\}. \tag{20} $$

In (20) $\nu$ is a real separation parameter, and if $\nu > 0$,
$\sqrt{-\nu} = -i\sqrt{\nu}$. If (20) is substituted into (14) we get the
eigenvalue equation

$$ \frac{d^2 e_y}{dx^2} + (k^2 K_y(x) + \nu) e_y = 0. \tag{21} $$

Equation (21) defines a singular, self-adjoint, second-order boundary
value problem on the interval $-\infty < x < \infty$. The theory of this
equation is well known, and we refer the reader to Coddington and
Levinson⁷ for a detailed treatment. We give a summary here of those
properties of such equations which we will need.

For all the models under consideration, the functions $K_m(x)$ are
positive, bounded functions, which are bounded away from zero, and
which are differentiable except for at most a finite number of step
discontinuities. Equation (21), therefore, defines a problem which is called
limit-point type at both plus and minus infinity. This means that for
arbitrary, complex $\nu$, (21) possesses exactly one solution (up to a
constant factor) which is square integrable over $0 < x < \infty$, and
exactly one solution which is square integrable over $-\infty < x < 0$.
For a given real number $\nu$, let $\varphi_1(x, \nu)$ and $\varphi_2(x, \nu)$
be the two solutions of (21) which satisfy the conditions that
$\varphi_j(x, \nu)$ and $\varphi_j'(x, \nu)$ be continuous and which satisfy
the initial conditions

$$ \varphi_1(0, \nu) = 1, \qquad \varphi_1'(0, \nu) = 0, \tag{22} $$
$$ \varphi_2(0, \nu) = 0, \qquad \varphi_2'(0, \nu) = 1, \tag{23} $$

where $' = d/dx$. Equation (21) also determines a 2 × 2 matrix-valued
function $\rho(\nu)$, $-\infty < \nu < \infty$, having the following
properties: (i) $\rho(\nu)$ is Hermitian ($\rho_{jk}(\nu) = \rho_{kj}(\nu)^*$).
(ii) $\rho(\nu) - \rho(\mu)$ is positive semidefinite if $\nu > \mu$.
(iii) $\rho_{jk}(\nu)$ is of bounded variation on every finite interval.
The matrix $\rho(\nu)$ is called the spectral density matrix and its
construction is outlined in Section III. Then if $f(x)$ is any square
integrable function ($\int_{-\infty}^{\infty} |f(x)|^2 \, dx < \infty$), we can
define two transforms of $f(x)$, $g_j(\nu)$ $(j = 1, 2)$, such that

$$ \lim_{L \to \infty} \int_{-\infty}^{\infty} \sum_{j,k=1}^{2} \left\{ g_j(\nu) - \int_{-L}^{L} f(x)\varphi_j(x, \nu) \, dx \right\} \left\{ g_k(\nu) - \int_{-L}^{L} f(x)\varphi_k(x, \nu) \, dx \right\}^* d\rho_{jk}(\nu) = 0. \tag{24a} $$

This is referred to as convergence in the mean with respect to the
measure $\rho(\nu)$, and in the manner of Fourier transforms of $L^2$
functions, we write

$$ g_j(\nu) = \int_{-\infty}^{\infty} f(x)\varphi_j(x, \nu) \, dx \qquad (j = 1, 2). \tag{24b} $$

In terms of these transforms, the Parseval equality

$$ \int_{-\infty}^{\infty} |f(x)|^2 \, dx = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} g_j(\nu) g_k(\nu)^* \, d\rho_{jk}(\nu), \tag{25} $$

and the expansion

$$ f(x) = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \varphi_j(x, \nu) g_k(\nu) \, d\rho_{jk}(\nu) \tag{26} $$

are valid. Equation (26) is defined in terms of convergence in the
mean. The set of real points $\nu$ at which the functions $\rho_{jk}(\nu)$
are nonconstant is the spectrum of (21). The set of points where any
$\rho_{jk}(\nu)$ is discontinuous is the point spectrum and for each such
value of $\nu$, (21) has exactly one square integrable solution. The
continuous spectrum is the set of points of continuity of $\rho(\nu)$ which
are in the spectrum. In Section III we will exhibit the spectral density
matrices for two important models.
We can now write down a formal expression for the transmitted
field:

$$ e_y^{(t)}(x, z) = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \exp\{-i\sqrt{-\nu}\, z\} \, \varphi_j(x, \nu) g_k(\nu) \, d\rho_{jk}(\nu). \tag{27} $$

The two initial value solutions $\varphi_j(x, \nu)$ $(j = 1, 2)$, as well as
the functions $\rho_{jk}(\nu)$ $(j, k = 1, 2)$ are determined, independently
of the boundary conditions at z = 0, by (21) and we can assume that they
are known. The two unknown functions $g_j(\nu)$ $(j = 1, 2)$ in (27) are
determined by the field at z = 0, since with the aid of (24) we can write

$$ g_j(\nu) = \int_{-\infty}^{\infty} e_y^{(t)}(x, 0) \varphi_j(x, \nu) \, dx. \tag{28} $$

It is clear that because of the factor $\exp\{-i\sqrt{-\nu}\, z\}$, the parts
of the integrals $\int_{-\infty}^{0}$ in (27) represent the propagating
portion of the transmitted field, while the parts $\int_{0}^{\infty}$
represent the evanescent portion of the transmitted field. With the aid of
the Parseval relation, (25), we can write down an expression for the time
averaged power transmitted across any $\Sigma(z)$, $z \ge 0$,

$$ P_t = (2\omega\mu_0)^{-1} \sum_{j,k=1}^{2} \int_{-\infty}^{0} \sqrt{-\nu}\, g_j(\nu) g_k(\nu)^* \, d\rho_{jk}(\nu). \tag{29} $$

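We now make use of the conditions that $e_y(x, z)$ and $h_x(x, z)$ must
be continuous at z = 0 in order to write down a set of integral equations
which determines $\mathcal{E}^{(r)}(l)$, $g_1(\nu)$, and $g_2(\nu)$.

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \{\mathcal{E}^{(i)}(l) + \mathcal{E}^{(r)}(l)\} e^{-ilx} \, dl = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \varphi_j(x, \nu) g_k(\nu) \, d\rho_{jk}(\nu), \tag{30} $$

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \Omega(l) \{\mathcal{E}^{(i)}(l) - \mathcal{E}^{(r)}(l)\} e^{-ilx} \, dl = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \sqrt{-\nu}\, \varphi_j(x, \nu) g_k(\nu) \, d\rho_{jk}(\nu). \tag{31} $$
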
Although there appear to be only two equations in three unknown 
functions, because of (24) and (26), (30) and (31) are sufficient to 
determine the unknown functions. We indicate formally why this is 
true, although it will be clear from the results of Section IV that this 
scheme must be modified in specific cases. We do not go into these 
modifications, because in Section IV we use a different scheme to get 
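approximate solutions. With the aid of (24b), solve (30) and (31) for
$g_j(\nu)$, giving the four equations

$$ g_j(\nu) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi_j(x, \nu) \, dx \int_{-\infty}^{\infty} [\mathcal{E}^{(i)}(l) + \mathcal{E}^{(r)}(l)] e^{-ilx} \, dl, \tag{32} $$

$$ \sqrt{-\nu}\, g_j(\nu) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi_j(x, \nu) \, dx \int_{-\infty}^{\infty} \Omega(l) [\mathcal{E}^{(i)}(l) - \mathcal{E}^{(r)}(l)] e^{-ilx} \, dl, \tag{33} $$

for $j = 1, 2$. On eliminating $g_j(\nu)$ between these equations we get the
two equations in the unknown $\mathcal{E}^{(r)}(l)$,

$$ \int_{-\infty}^{\infty} \varphi_j(x, \nu) \, dx \int_{-\infty}^{\infty} (\sqrt{-\nu} + \Omega(l)) \mathcal{E}^{(r)}(l) e^{-ilx} \, dl = \int_{-\infty}^{\infty} \varphi_j(x, \nu) \, dx \int_{-\infty}^{\infty} (-\sqrt{-\nu} + \Omega(l)) \mathcal{E}^{(i)}(l) e^{-ilx} \, dl, \tag{34} $$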


for 7 = 1, 2. Now from (26) we can write f(x) = f,(v) + fe(x) where 
fe) = DP olen) donb) = 1,2), BS) 


[ t@dese,) de = bn90) GB =1,2, G6) 


and $\delta_{jk}$ is the Kronecker delta function. It is this decomposition
of an arbitrary $f(x)$ into components lying in the two subspaces spanned
by $\varphi_1(x, \nu)$ and $\varphi_2(x, \nu)$ which is reflected in the two
integral equations (34). The solution of (34) with given $j$ yields the
component of the reflected field lying in the subspace spanned by the
corresponding $\varphi_j(x, \nu)$. Let $\mathcal{E}_j^{(r)}(l)$, $j = 1, 2$,
denote the two solutions. Then
$\mathcal{E}^{(r)}(l) = \mathcal{E}_1^{(r)}(l) + \mathcal{E}_2^{(r)}(l)$
describes the total reflected field. With this result $g_j(\nu)$
$(j = 1, 2)$ can be obtained from either (32) or (33). We have been
unable to obtain exact solutions for the integral equations (30)-(31)
for any of the models considered here. However, in Section IV
approximate solutions are obtained for certain situations of interest.


2.2 TM Fields 
We next seek TM solutions of Maxwell’s equations of the form 


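$$ \mathbf{e}(x, z) = (e_x(x, z), 0, e_z(x, z)), \qquad \mathbf{h}(x, z) = (0, h_y(x, z), 0). \tag{37} $$

In the region z < 0, $h_y$ must satisfy (13). In the region z > 0, $h_y$
must satisfy the equation

$$ \frac{\partial}{\partial x} \left\{ \frac{1}{K_z(x)} \frac{\partial h_y}{\partial x} \right\} + \frac{\partial}{\partial z} \left\{ \frac{1}{K_x(x)} \frac{\partial h_y}{\partial z} \right\} + k^2 h_y = 0. \tag{38} $$

Just as for the TE fields, a general incident field due to sources in
z < 0 at a finite distance from the plane z = 0 is

$$ h_y^{(i)}(x, z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathcal{H}^{(i)}(l) \exp\{-i\Omega(l)z - ilx\} \, dl. \tag{39} $$

The time averaged power due to this wave which is incident on $\Sigma(z)$,
$z \le 0$, is

$$ P_i = \tfrac{1}{2} \mathrm{Re} \int_{-\infty}^{\infty} e_x^{(i)}(x, z) \, h_y^{(i)}(x, z)^* \, dx = (4\pi\omega\epsilon_0)^{-1} \int_{-k}^{k} \Omega(l) \, |\mathcal{H}^{(i)}(l)|^2 \, dl. \tag{40} $$

We assume that $\int_{-\infty}^{\infty} |\mathcal{H}^{(i)}(l)|^2 \, dl < \infty$
and $\int_{-\infty}^{\infty} |\Omega(l)| \, |\mathcal{H}^{(i)}(l)|^2 \, dl < \infty$.
As for the TE field, if the sources of the TM field are at $z = -\infty$
then $\mathcal{H}^{(i)}(l) = 0$, $|l| > k$. Furthermore, it will always be
assumed that $\mathcal{H}^{(i)}(l)$ is known.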
A solution of (13) describing a general reflected wave is 


$$ h_y^{(r)}(x, z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathcal{H}^{(r)}(l) \exp\{i\Omega(l)z - ilx\} \, dl. \tag{41} $$

Just as in the case of the TE field, $h_y^{(r)}(x, z)$ can be split into a
propagating field and an evanescent field. The time averaged power
reflected back through the strip $\Sigma(z)$, $z \le 0$, is

$$ P_r = (4\pi\omega\epsilon_0)^{-1} \int_{-k}^{k} \Omega(l) \, |\mathcal{H}^{(r)}(l)|^2 \, dl. $$

The transmitted field is again treated by separation of variables,
and we write

$$ h_y^{(t)}(x, z) \propto h_y(x) \exp\{-i\sqrt{-\nu}\, z\}. $$

Then $h_y(x)$ satisfies the eigenvalue equation

$$ K_x(x) \frac{d}{dx} \left\{ \frac{1}{K_z(x)} \frac{dh_y}{dx} \right\} + (k^2 K_x(x) + \nu) h_y = 0. \tag{43} $$

Equation (43) is not in the canonical form of a self-adjoint boundary
value problem. However, if we make the change of variables

$$ u = \int_0^x K_z(t) \, dt, \tag{44} $$

(43) is transformed to the equation

$$ \frac{d^2 h_y}{du^2} + \{K_x(u) K_z(u)\}^{-1} (k^2 K_x(u) + \nu) h_y = 0. \tag{45} $$

This equation defines a self-adjoint boundary value problem,⁷ and
even though the function $\{K_x(u) K_z(u)\}^{-1}$ may have step
discontinuities, the techniques of Ref. 7 can be shown to be still valid.
Equation (45) is limit-point at $u = \pm\infty$, and so on transforming back
to the variable x, the following statements can be made.

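For a given real number $\nu$, let $\psi_1(x, \nu)$ and $\psi_2(x, \nu)$ be
the two solutions of (43) which satisfy the requirements that

$$ \psi_j(x, \nu) \quad \text{and} \quad \{K_z(x)\}^{-1} \psi_j'(x, \nu) $$

be continuous for all x, and which satisfy the initial conditions

$$ \psi_1(0, \nu) = 1, \qquad (1/K_z(0)) \psi_1'(0, \nu) = 0, \tag{46} $$
$$ \psi_2(0, \nu) = 0, \qquad (1/K_z(0)) \psi_2'(0, \nu) = 1. \tag{47} $$

Equation (43) determines a 2 × 2 spectral density matrix $\sigma(\nu)$ whose
construction is given in Section III. If $f(x)$ is any square integrable
function of x, we define two transforms of $f(x)$,

$$ h_j(\nu) = \int_{-\infty}^{\infty} f(x) \psi_j(x, \nu) \{K_x(x)\}^{-1} \, dx \qquad (j = 1, 2), \tag{48} $$

where equality in (48) is defined in terms of convergence in the mean
with respect to the measure $\sigma(\nu)$. In terms of these transforms, the
Parseval equality

$$ \int_{-\infty}^{\infty} |f(x)|^2 \{K_x(x)\}^{-1} \, dx = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} h_j(\nu) h_k(\nu)^* \, d\sigma_{jk}(\nu), \tag{49} $$

and the expansion

$$ f(x) = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \psi_j(x, \nu) h_k(\nu) \, d\sigma_{jk}(\nu), \tag{50} $$

are valid. The last equality is again defined in the sense of convergence
in the mean.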
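We can write down a formal expression for the transmitted field

$$ h_y^{(t)}(x, z) = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \exp\{-i\sqrt{-\nu}\, z\} \, \psi_j(x, \nu) h_k(\nu) \, d\sigma_{jk}(\nu). \tag{51} $$

The two initial value solutions $\psi_j(x, \nu)$ $(j = 1, 2)$, as well as the
functions $\sigma_{jk}(\nu)$ $(j, k = 1, 2)$ are determined, independently of
the boundary conditions at z = 0, by (43) and we can assume that they are
known. The two unknown functions $h_j(\nu)$ $(j = 1, 2)$ in (51) are
determined by the field at z = 0 since with the aid of (48) we can write

$$ h_j(\nu) = \int_{-\infty}^{\infty} h_y^{(t)}(x, 0) \psi_j(x, \nu) \{K_x(x)\}^{-1} \, dx. \tag{52} $$

With the aid of the Parseval relation, (49), we can write down an
expression for the time averaged power transmitted across any $\Sigma(z)$,
$z \ge 0$:

$$ P_t = (2\omega\epsilon_0)^{-1} \sum_{j,k=1}^{2} \int_{-\infty}^{0} \sqrt{-\nu}\, h_j(\nu) h_k(\nu)^* \, d\sigma_{jk}(\nu). \tag{53} $$

We can now make use of the conditions that $e_x(x, z)$ and $h_y(x, z)$
must be continuous at z = 0 in order to write down a set of integral
equations from which $\mathcal{H}^{(r)}(l)$, $h_1(\nu)$, and $h_2(\nu)$ can be
determined.

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \{\mathcal{H}^{(i)}(l) + \mathcal{H}^{(r)}(l)\} e^{-ilx} \, dl = \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \psi_j(x, \nu) h_k(\nu) \, d\sigma_{jk}(\nu), \tag{54} $$

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \Omega(l) \{\mathcal{H}^{(i)}(l) - \mathcal{H}^{(r)}(l)\} e^{-ilx} \, dl = \{1/K_x(x)\} \sum_{j,k=1}^{2} \int_{-\infty}^{\infty} \sqrt{-\nu}\, \psi_j(x, \nu) h_k(\nu) \, d\sigma_{jk}(\nu). \tag{55} $$

Just as in the case of the TE field, the solution of (54) and (55)
reduces to the solution of the two integral equations

$$ \int_{-\infty}^{\infty} \psi_j(x, \nu) \, dx \int_{-\infty}^{\infty} \{\sqrt{-\nu}/K_x(x) + \Omega(l)\} \mathcal{H}^{(r)}(l) e^{-ilx} \, dl = \int_{-\infty}^{\infty} \psi_j(x, \nu) \, dx \int_{-\infty}^{\infty} \{-\sqrt{-\nu}/K_x(x) + \Omega(l)\} \mathcal{H}^{(i)}(l) e^{-ilx} \, dl, \qquad (j = 1, 2). \tag{56} $$
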
III. THE SPECTRAL DENSITY MATRIX FOR SEVERAL MODELS


3.1 General Outline of the Construction 


In Section II it was shown that the determination of the transmitted
field for a given model depended on a knowledge of the initial value
solutions $\varphi_j(x, \nu)$ and $\psi_j(x, \nu)$ $(j = 1, 2)$ and the
spectral density matrices $\rho(\nu)$ and $\sigma(\nu)$. In this section we
study these functions in some detail for two simple but important models,
the symmetric step model and the asymmetric step model. These calculations
illustrate the technique for treating the whole class of piecewise
constant models.

We first outline the general construction of the spectral density
matrices.⁷ The solutions of (21) have the property that the functions
$\varphi_j(x, \nu)$, $\varphi_j'(x, \nu)$ $(j = 1, 2)$ are entire functions
of $\nu$ for each fixed x, when $\nu$ is a complex variable. The first step
is to determine the two functions of $\nu$, $m_{\infty}(\nu)$ and
$m_{-\infty}(\nu)$ such that when $\mathrm{Im}\, \nu > 0$,
$\varphi_1(x, \nu) + m_{\infty}(\nu)\varphi_2(x, \nu)$ is a square integrable
function of x over $[0, \infty]$ and
$\varphi_1(x, \nu) + m_{-\infty}(\nu)\varphi_2(x, \nu)$ is square integrable
over $[-\infty, 0]$. The elements of the spectral density matrix are then
given by the formula

$$ \rho_{jk}(\nu) - \rho_{jk}(\mu) = \lim_{\epsilon \to +0} \frac{1}{\pi} \int_{\mu}^{\nu} \mathrm{Im}\, M_{jk}(\xi + i\epsilon) \, d\xi, \tag{57} $$


where $\mu$ and $\nu$ are real, Im denotes the imaginary part, and for
arbitrary complex $\nu$

$$ M_{11}(\nu) = (m_{-\infty}(\nu) - m_{\infty}(\nu))^{-1}, \tag{58} $$
$$ M_{12}(\nu) = M_{21}(\nu) = \tfrac{1}{2}(m_{-\infty}(\nu) + m_{\infty}(\nu))(m_{-\infty}(\nu) - m_{\infty}(\nu))^{-1}, \tag{59} $$
$$ M_{22}(\nu) = m_{-\infty}(\nu) m_{\infty}(\nu) (m_{-\infty}(\nu) - m_{\infty}(\nu))^{-1}. \tag{60} $$

Equation (57) defines $\rho_{jk}(\nu)$ uniquely at points of continuity up
to an arbitrary, additive constant. The functions $M_{jk}(\nu)$
$(j, k = 1, 2)$ are meromorphic if $\mathrm{Im}\, \nu \neq 0$ and all their
real poles are simple. The point spectrum consists exactly of the points
which are real poles of one of the $M_{jk}(\nu)$. There are at most a
countable number of such points. Let $\nu_0$ be a real pole of
$M_{jk}(\nu)$ and let $a_{jk}$ be the residue there,

$$ M_{jk}(\nu) = \frac{a_{jk}}{\nu - \nu_0} + \cdots. \tag{61} $$

Then it follows from (57) and (61) that

$$ \rho_{jk}(\nu_0 + 0) - \rho_{jk}(\nu_0 - 0) = -\mathrm{Re}\,(a_{jk}). \tag{62} $$

If $\nu_0$ is not a pole of any $M_{jk}(\nu)$, and
$\mathrm{Im}\, M_{jk}(\nu_0) \neq 0$ for some $(j, k)$, then $\nu_0$ is a point
of the continuous spectrum and

$$ d\rho_{jk}(\nu_0) = \frac{1}{\pi} \mathrm{Im}\, M_{jk}(\nu_0) \, d\nu. \tag{63} $$

If $\nu_0$ is not a pole of any $M_{jk}(\nu)$ and $\mathrm{Im}\, M_{jk}(\nu) = 0$
for all $(j, k)$ in some neighborhood of $\nu_0$, then $\nu_0$ is not in the
spectrum and

$$ d\rho_{jk}(\nu) = 0 \qquad (j, k = 1, 2) \tag{64} $$

in a neighborhood of $\nu_0$.
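
The relation between a simple real pole and the jump of the spectral function can be illustrated numerically. The sketch below is a minimal check under stated assumptions: it evaluates the limit formula (57) in the form $(1/\pi)\int \mathrm{Im}\, M(\xi + i\epsilon)\, d\xi$ at a small fixed $\epsilon$ for a model function $M(\nu) = a/(\nu - \nu_0)$, where the residue `a`, the pole position `nu0`, and the function name `rho_increment` are hypothetical choices made only for this example.

```python
import math


def rho_increment(m_func, mu, nu, eps=1.0e-3, steps=200000):
    """Approximate (1/pi) * integral from mu to nu of Im M(xi + i*eps) d(xi) (midpoint rule)."""
    h = (nu - mu) / steps
    total = 0.0
    for i in range(steps):
        xi = mu + (i + 0.5) * h
        total += m_func(complex(xi, eps)).imag
    return total * h / math.pi


a = -0.5    # hypothetical real residue
nu0 = 0.3   # hypothetical real pole position


def m_pole(v):
    """Model function with a single simple pole at nu0."""
    return a / (v - nu0)


# Increment of rho across an interval containing the pole.
jump = rho_increment(m_pole, nu0 - 1.0, nu0 + 1.0)

# Increment across an interval that stays away from the pole.
away = rho_increment(m_pole, nu0 + 0.5, nu0 + 1.0)
```

For small `eps` the increment across the pole approaches $-\mathrm{Re}\,(a)$, in agreement with (62), while the increment over an interval that excludes the pole tends to zero, as (64) requires.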


3.2 TH Fields for Symmetric Step Model 


We now apply these formulas to the symmetric step model for the 
case of the TF field. The functions K,(x) (n = 2, y, 2) are defined by 
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(1 and 2). Equation (21) has constant coefficients in the two regions 
|x] < w and |z| > w. Since e,(x, 2) and h(x, 2) must be continuous 
at x = -tw, the desired solution of (21) must be continuous and have 
a continuous derivative. We have 


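$$\phi_1(x, \nu) = \cos(\omega_y x), \qquad |x| \le w \tag{65}$$
$$\phantom{\phi_1(x, \nu)} = \cos(\omega_y w)\cos\{\omega_0(|x| - w)\} - (\omega_y/\omega_0)\sin(\omega_y w)\sin\{\omega_0(|x| - w)\}, \qquad |x| \ge w \tag{66}$$
$$\phi_2(x, \nu) = (1/\omega_y)\sin(\omega_y x), \qquad |x| \le w \tag{67}$$
$$\phantom{\phi_2(x, \nu)} = (1/\omega_y)\sin(\omega_y w)\cos\{\omega_0(x - w)\} + (1/\omega_0)\cos(\omega_y w)\sin\{\omega_0(x - w)\}, \qquad x \ge w \tag{68}$$
$$\phi_2(x, \nu) = -\phi_2(-x, \nu), \qquad x \le -w \tag{69}$$

where

$$\omega_n(\nu) = (\nu + k^2 K_n)^{1/2} \qquad (n = 0, x, y). \tag{70}$$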


In (70) $\omega_n$ is defined as a single-valued function of $\nu$ in the complex plane cut along the real axis from $-k^2 K_n$ to $\infty$. That branch is chosen which is positive real on the upper side of the cut. Simple calculations now yield

$$m_{\infty}(\nu) = -m_{-\infty}(\nu) = \{\omega_y \sin(\omega_y w) + i\omega_0 \cos(\omega_y w)\}\{\cos(\omega_y w) - i(\omega_0/\omega_y)\sin(\omega_y w)\}^{-1}. \tag{71}$$

Therefore,

$$M_{11}(\nu) = -1/\{4 M_{22}(\nu)\} = 1/\{2 m_{-\infty}(\nu)\}, \tag{72}$$
$$M_{12}(\nu) = M_{21}(\nu) = 0. \tag{73}$$


In order to determine the spectrum, we begin by decomposing the whole real axis into the union of three intervals

$$I_1 = [-\infty, -k^2 K_y], \qquad I_2 = (-k^2 K_y, -k^2 K_0), \qquad I_3 = [-k^2 K_0, \infty]. \tag{74}$$

From (57) and (73) it is clear that $\rho_{12}(\nu)$ and $\rho_{21}(\nu)$ are constant for all $\nu$, hence

$$d\rho_{12}(\nu) = d\rho_{21}(\nu) = 0, \qquad -\infty \le \nu \le \infty. \tag{75}$$


It is easily seen that $M_{11}(\nu)$ and $M_{22}(\nu)$ are real and have no poles or zeros in $I_1$. Therefore, $I_1$ contains no points of the spectrum, and

$$\rho_{jj}(\nu) = \rho_{jj}(-\infty), \qquad d\rho_{jj}(\nu) = 0, \qquad \nu \in I_1 \quad (j = 1, 2). \tag{76}$$

In the interval $I_2$, $M_{11}(\nu)$ and $M_{22}(\nu)$ can each have a finite number of poles, and from (72) it follows that the poles of $M_{11}(\nu)$ are the zeros of $M_{22}(\nu)$ and vice versa. The real poles of $M_{11}(\nu)$ are the real solutions of

$$\omega_y \sin(\omega_y w) + i\omega_0 \cos(\omega_y w) = 0 \tag{77}$$

and the real poles of $M_{22}(\nu)$ are the real solutions of

$$\cos(\omega_y w) - i(\omega_0/\omega_y)\sin(\omega_y w) = 0. \tag{78}$$

For $\nu \in I_2$, $\omega_y$ is real while $\omega_0$ is purely imaginary. If we let

$$b(\nu) = \omega_y(\nu), \qquad p(\nu) = -i\omega_0(\nu) = (-\nu - k^2 K_0)^{1/2}, \tag{79}$$


then (77) in the single unknown $\nu$ can be replaced by the set of three equations

$$-\nu = k^2 K_0 + p^2, \qquad -\nu = k^2 K_y - b^2, \qquad b \tan bw = p, \tag{80}$$

in the two positive real unknowns $b$ and $p$ and the original unknown $\nu$. Similarly, (78) can be replaced by the set of equations

$$-\nu = k^2 K_0 + p^2, \qquad -\nu = k^2 K_y - b^2, \qquad b \cot bw = -p. \tag{81}$$

These equations are well known and their solutions have been determined.® ° The set of equations (80) has a finite number of real solutions and always has at least one solution for all positive values of the parameters $w$, $k$, $K_y - K_0$. These are the even modes of NM. We denote the corresponding values of $\nu$ by $\nu_{1j}$, $j = 1, 2, \cdots, R_1$. The set of equations (81) also has at most a finite number of solutions, although if $(wk)^2 (K_y - K_0)$ is small enough it has no real solutions. These are the odd modes of NM. We denote the values of $\nu$ corresponding to these roots by $\nu_{2j}$, $j = 1, 2, \cdots, R_2$. The points $\nu_{1j}$, $\nu_{2j}$, which are all in the interval $I_2$, comprise the point spectrum of (21). Let


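$$\delta\rho(\nu) = \lim_{\epsilon \to 0} \{\rho(\nu + \epsilon) - \rho(\nu - \epsilon)\}. \tag{82}$$

Then with the aid of (62) it is easy to show that

$$\delta\rho_{11}(\nu_{1j}) = p(\nu_{1j})/\{1 + w\,p(\nu_{1j})\}, \qquad \delta\rho_{22}(\nu_{1j}) = 0, \qquad j = 1, 2, \cdots, R_1, \tag{83}$$
$$\delta\rho_{11}(\nu_{2j}) = 0, \qquad \delta\rho_{22}(\nu_{2j}) = b^2(\nu_{2j})\,p(\nu_{2j})/\{1 + w\,p(\nu_{2j})\}, \qquad j = 1, 2, \cdots, R_2. \tag{84}$$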
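The mode conditions (80) and (81) and the jump formula (83) lend themselves to a quick numerical check. The following is a minimal sketch, not part of the original analysis: the function names, the grid-plus-bisection root search, and the sample parameter values are all illustrative assumptions. It uses $b^2 + p^2 = k^2(K_y - K_0)$, which follows from the first two relations in (80), and compares the jump (83) with the reciprocal of the integral of $\phi_1^2$ for a profile equal to $\cos bx$ inside the guide and decaying as $\exp\{p(w - |x|)\}$ outside.

```python
import math

def symmetric_te_modes(w, k, K_y, K_0, parity="even", steps=4000):
    """Solve (80) (parity="even") or (81) (parity="odd") for (b, p, nu).

    Illustrative helper: scans b in (0, k*sqrt(K_y - K_0)] for sign changes
    of a pole-free form of the third relation, then bisects.
    """
    v2 = k * k * (K_y - K_0)              # b^2 + p^2
    b_max = math.sqrt(v2)

    def f(b):
        p = math.sqrt(max(v2 - b * b, 0.0))
        if parity == "even":
            # b tan(bw) = p, multiplied through by cos(bw)
            return b * math.sin(b * w) - p * math.cos(b * w)
        # b cot(bw) = -p, multiplied through by sin(bw)
        return b * math.cos(b * w) + p * math.sin(b * w)

    modes = []
    grid = [b_max * i / steps for i in range(1, steps + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if f(lo) * f(hi) < 0.0:
            for _ in range(80):           # plain bisection
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            b = 0.5 * (lo + hi)
            p = math.sqrt(v2 - b * b)
            modes.append((b, p, b * b - k * k * K_y))   # nu = b^2 - k^2 K_y
    return modes

def jump_even(p, w):
    """Spectral jump (83) for an even mode."""
    return p / (1.0 + w * p)

def norm_even(b, p, w):
    """Integral over the real line of the squared even-mode profile."""
    return w + math.sin(2 * b * w) / (2 * b) + math.cos(b * w) ** 2 / p

if __name__ == "__main__":
    for b, p, nu in symmetric_te_modes(1.0, 1.0, 5.0, 1.0, "even"):
        print(b, p, nu, jump_even(p, 1.0) * norm_even(b, p, 1.0))
```

For a weak or narrow guide the odd-mode list comes back empty while the even-mode list never does, which is the behaviour stated in the text.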


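With the aid of (65) through (69) and (77) through (79) it is readily shown that

$$\phi_1(x, \nu_{1j}) = \cos(b(\nu_{1j})x), \qquad |x| \le w \tag{85}$$
$$\phantom{\phi_1(x, \nu_{1j})} = \cos(b(\nu_{1j})w)\exp\{p(\nu_{1j})(w - |x|)\}, \qquad |x| \ge w \tag{86}$$
$$\phi_2(x, \nu_{2j}) = \{1/b(\nu_{2j})\}\sin(b(\nu_{2j})x), \qquad |x| \le w \tag{87}$$
$$\phantom{\phi_2(x, \nu_{2j})} = \{1/b(\nu_{2j})\}\sin(b(\nu_{2j})w)\exp\{p(\nu_{2j})(w - x)\}, \qquad x \ge w \tag{88}$$
$$\phi_2(x, \nu_{2j}) = -\phi_2(-x, \nu_{2j}), \qquad x \le -w. \tag{89}$$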


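It is also true that

$$\int_{-\infty}^{\infty} \phi_j^2(x, \nu_{jk})\, dx = 1/\delta\rho_{jj}(\nu_{jk}), \qquad k = 1, 2, \cdots, R_j, \quad j = 1, 2. \tag{90}$$

The remaining points in $I_2$ are not in the spectrum.

Finally, in the interval $I_3$ it is readily shown that $M_{11}(\nu)$ and $M_{22}(\nu)$ have no poles. It is shown easily then that the whole interval $I_3$ is in the continuous spectrum, and in this interval

$$d\rho_{jj}(\nu) = \rho'_{jj}(\nu)\, d\nu \qquad (j = 1, 2), \tag{91}$$

where

$$\rho'_{11}(\nu) = \frac{1}{2\pi}\left[\omega_y^2 \sin^2(\omega_y w) + \omega_0^2 \cos^2(\omega_y w)\right]^{-1} \omega_0, \tag{92}$$
$$\rho'_{22}(\nu) = \frac{1}{2\pi}\left[\omega_y^2 \cos^2(\omega_y w) + \omega_0^2 \sin^2(\omega_y w)\right]^{-1} \omega_y^2\, \omega_0. \tag{93}$$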
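The closed forms (92) and (93) can be cross-checked against the general rule (63) applied to (71) and (72). The sketch below is an illustration under stated assumptions: the function names and the sample values of $w$, $k$, $K_y$, $K_0$ are hypothetical, and $\nu$ is taken in $I_3$ so that $\omega_y$ and $\omega_0$ are positive real on the upper side of the cut.

```python
import math

def m_inf(nu, w, k, K_y, K_0):
    """Coefficient m_inf(nu) of (71) for nu in I_3 (omega_y, omega_0 real)."""
    wy = math.sqrt(nu + k * k * K_y)
    w0 = math.sqrt(nu + k * k * K_0)
    num = wy * math.sin(wy * w) + 1j * w0 * math.cos(wy * w)
    den = math.cos(wy * w) - 1j * (w0 / wy) * math.sin(wy * w)
    return num / den

def density_from_M(nu, w, k, K_y, K_0):
    """(1/pi) Im M_11 and (1/pi) Im M_22, using (63) and (72)."""
    m = m_inf(nu, w, k, K_y, K_0)
    M11 = -1.0 / (2.0 * m)
    M22 = m / 2.0                      # equivalent to M_11 = -1/(4 M_22)
    return M11.imag / math.pi, M22.imag / math.pi

def density_closed_form(nu, w, k, K_y, K_0):
    """Closed forms (92) and (93)."""
    wy = math.sqrt(nu + k * k * K_y)
    w0 = math.sqrt(nu + k * k * K_0)
    s2 = math.sin(wy * w) ** 2
    c2 = math.cos(wy * w) ** 2
    rho11 = w0 / (2.0 * math.pi * (wy * wy * s2 + w0 * w0 * c2))
    rho22 = wy * wy * w0 / (2.0 * math.pi * (wy * wy * c2 + w0 * w0 * s2))
    return rho11, rho22

if __name__ == "__main__":
    for nu in (-0.9, -0.5, 0.0, 2.0):  # I_3 = [-k^2 K_0, inf) with k = K_0 = 1
        print(nu, density_from_M(nu, 0.7, 1.0, 5.0, 1.0),
              density_closed_form(nu, 0.7, 1.0, 5.0, 1.0))
```

Both densities come out positive throughout $I_3$, as a spectral density must.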


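In summary, the spectrum of (21) consists of the points $\nu_{jk}$, $k = 1, 2, \cdots, R_j$, $j = 1, 2$, and the interval $I_3$. Equation (27) for the transmitted field can be written as

$$\begin{aligned}
e_y^{(t)}(x, z) ={}& \sum_{j=1}^{2}\sum_{k=1}^{R_j} \delta\rho_{jj}(\nu_{jk})\, g_j(\nu_{jk})\, \phi_j(x, \nu_{jk}) \exp\{-i\sqrt{-\nu_{jk}}\, z\} \\
&+ \sum_{j=1}^{2} \int_{-k^2 K_0}^{0} \exp\{-i\sqrt{-\nu}\, z\}\, \phi_j(x, \nu)\, g_j(\nu)\, \rho'_{jj}(\nu)\, d\nu \\
&+ \sum_{j=1}^{2} \int_{0}^{\infty} \exp\{-\sqrt{\nu}\, z\}\, \phi_j(x, \nu)\, g_j(\nu)\, \rho'_{jj}(\nu)\, d\nu.
\end{aligned} \tag{94}$$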
The terms in the first, double summation in (94) are just the possible TE modes which can be excited in the waveguide. The terms in the second summation represent the propagating continuum field while the terms in the last summation represent the evanescent part of the transmitted field. A useful interpretation of the propagating continuum field can be obtained as follows. Consider within the waveguide in the region $x < -w$ an incident plane wave of the form

$$E(x, z, \nu) = \exp\{-i\sqrt{-\nu}\, z - i\omega_0(\nu)x\}, \tag{95}$$

so that if $\theta$ is the direction of propagation of this wave (measured clockwise from the positive $z$ axis), then

$$\cos\theta = \sqrt{-\nu}/(k\sqrt{K_0}), \qquad \sin\theta = \omega_0(\nu)/(k\sqrt{K_0}). \tag{96}$$


On striking the region of higher dielectric constant, $|x| < w$, part of this wave will be reflected and part of it will be transmitted through the region $|x| < w$. Denote by $\chi_+(x, z, \nu)$ this total electromagnetic field set up by the incident wave, (95). Similarly, denote by $\chi_-(x, z, \nu)$ the total electromagnetic field set up by the incident wave in the region $x > w$

$$E(x, z, \nu) = \exp\{-i\sqrt{-\nu}\, z + i\omega_0(\nu)x\}. \tag{97}$$

In Fig. 3 we give a schematic description of $\chi_+$ and $\chi_-$. Then it can be shown that for $-k^2 K_0 \le \nu \le 0$,

$$\exp\{-i\sqrt{-\nu}\, z\}\,\phi_j(x, \nu) = a_j(\nu)\chi_+(x, z, \nu) + b_j(\nu)\chi_-(x, z, \nu) \qquad (j = 1, 2). \tag{98}$$

For the above values of $\nu$ the directions of propagation of the incident waves for $\chi_+$ and $\chi_-$ fill the interval $-\pi/2 \le \theta \le \pi/2$. Thus, the propagating continuum field is just a wave packet of plane waves appropriate to the medium defined by the dielectric tensor $K_n(x)$.

Fig. 3. A schematic diagram of the plane waves appropriate to the dielectric medium in the symmetric step model: (a) $\chi_+$, (b) $\chi_-$. The wave $\chi_+$ is incident on the junction region from the positive x direction, while $\chi_-$ is incident from the negative x direction.

Similarly, the evanescent part of the field can be interpreted as a superposition of waves bound to the surface $z = 0$ and propagating in the positive and negative $x$ directions. The distinction between the propagating and evanescent parts of the transmitted field is further shown in the expression for the time averaged transmitted power, (29), which for the symmetric step model is


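$$\begin{aligned}
P_t ={}& (2\omega\mu_0)^{-1} \sum_{j=1}^{2}\sum_{k=1}^{R_j} \sqrt{-\nu_{jk}}\; |g_j(\nu_{jk})|^2\, \delta\rho_{jj}(\nu_{jk}) \\
&+ (2\omega\mu_0)^{-1} \sum_{j=1}^{2} \int_{-k^2 K_0}^{0} \sqrt{-\nu}\; |g_j(\nu)|^2\, \rho'_{jj}(\nu)\, d\nu.
\end{aligned} \tag{99}$$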


As this expression shows, the evanescent part of the field transmits no 
energy on the average. 


3.3 TM Fields For Symmetric Step Model 


The TM fields of the symmetric step model can be treated similarly. Equation (43) has constant coefficients in the two regions $|x| < w$ and $|x| > w$. Since $e_z(x, z)$ and $h_y(x, z)$ must be continuous at $x = \pm w$, the solutions of (43) must be such that $\psi_j(x, \nu)$ and $\{1/K_z(x)\}\psi'_j(x, \nu)$ $(j = 1, 2)$ are continuous. We have


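$$\psi_1(x, \nu) = \cos(K_b\omega_x x), \qquad |x| \le w \tag{100}$$
$$\phantom{\psi_1(x, \nu)} = \cos(K_b\omega_x w)\cos\{\omega_0(|x| - w)\} - \{(\omega_x K_0)/(\omega_0 K_a)\}\sin(K_b\omega_x w)\sin\{\omega_0(|x| - w)\}, \qquad |x| \ge w \tag{101}$$
$$\psi_2(x, \nu) = \{K_a/\omega_x\}\sin(K_b\omega_x x), \qquad |x| \le w \tag{102}$$
$$\phantom{\psi_2(x, \nu)} = \{K_a/\omega_x\}\sin(K_b\omega_x w)\cos\{\omega_0(x - w)\} + \{K_0/\omega_0\}\cos(K_b\omega_x w)\sin\{\omega_0(x - w)\}, \qquad x \ge w \tag{103}$$
$$\psi_2(x, \nu) = -\psi_2(-x, \nu), \qquad x \le -w \tag{104}$$

where

$$K_a = (K_x K_z)^{1/2}, \qquad K_b = (K_z/K_x)^{1/2}, \tag{105}$$

and $\omega_x$ and $\omega_0$ are defined in (70). Next,

$$m_{\infty}(\nu) = -m_{-\infty}(\nu) = \{(\omega_x/K_a)\sin(K_b\omega_x w) + i(\omega_0/K_0)\cos(K_b\omega_x w)\}\{\cos(K_b\omega_x w) - i(K_a\omega_0/K_0\omega_x)\sin(K_b\omega_x w)\}^{-1}. \tag{106}$$

Therefore,

$$M_{11}(\nu) = -1/\{4 M_{22}(\nu)\} = 1/\{2 m_{-\infty}(\nu)\}, \tag{107}$$
$$M_{12}(\nu) = M_{21}(\nu) = 0, \tag{108}$$

and from (57) and (108) we have

$$d\sigma_{12}(\nu) = d\sigma_{21}(\nu) = 0, \qquad -\infty < \nu < \infty. \tag{109}$$

The spectrum in the case of TM fields is determined in the same way as in the case of TE fields, and we merely state the results. There are no points of the spectrum in the interval $I_1 = [-\infty, -k^2 K_x]$,

$$\sigma_{jj}(\nu) = \sigma_{jj}(-\infty), \qquad d\sigma_{jj}(\nu) = 0, \qquad \nu \in I_1 \quad (j = 1, 2). \tag{110}$$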
The interval $I_2 = (-k^2 K_x, -k^2 K_0)$ contains a finite number of points in the point spectrum. The points of discontinuity of $\sigma_{11}(\nu)$ are the real solutions of

$$(\omega_x/K_a)\sin(K_b\omega_x w) + i(\omega_0/K_0)\cos(K_b\omega_x w) = 0, \tag{111}$$

while the points of discontinuity of $\sigma_{22}(\nu)$ are the real solutions of

$$\cos(K_b\omega_x w) - i(K_a\omega_0/K_0\omega_x)\sin(K_b\omega_x w) = 0. \tag{112}$$

If we let

$$b(\nu) = K_b\,\omega_x(\nu), \qquad p(\nu) = -i\omega_0(\nu) = (-\nu - k^2 K_0)^{1/2}, \tag{113}$$

then (111) in the single unknown $\nu$ can be replaced by the set of equations

$$-\nu = k^2 K_0 + p^2, \qquad -\nu = k^2 K_x - K_x b^2/K_z, \qquad b K_0 \tan bw = p K_z, \tag{114}$$

in the two positive real unknowns $b$ and $p$ and the original unknown $\nu$. In the same way, (112) can be replaced by the set of equations

$$-\nu = k^2 K_0 + p^2, \qquad -\nu = k^2 K_x - K_x b^2/K_z, \qquad b K_0 \cot bw = -p K_z. \tag{115}$$

The set of equations (114) has a finite number of real solutions and for all positive values of the parameters $K_0/K_z$, $K_x/K_z$, $w$, $k^2(K_x - K_0)$ there is always at least one solution.* * These are the even modes of NM. The corresponding values of $\nu$ are denoted by $\nu_{1j}$, $j = 1, 2, \cdots, S_1$. The set of equations (115) also has at most a finite number of solutions, although if $(wk)^2(K_x - K_0)$ is small enough it has no real solutions. These are the odd modes of NM. The corresponding values of $\nu$ are denoted by $\nu_{2j}$, $j = 1, 2, \cdots, S_2$. The points $\nu_{1j}$, $\nu_{2j}$ are the point spectrum of (43) and they all lie in the interval $I_2$. Furthermore,


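$$\delta\sigma_{11}(\nu_{1j}) = S(p(\nu_{1j})), \qquad \delta\sigma_{22}(\nu_{1j}) = 0, \qquad j = 1, 2, \cdots, S_1, \tag{116}$$
$$\delta\sigma_{11}(\nu_{2j}) = 0, \qquad \delta\sigma_{22}(\nu_{2j}) = b(\nu_{2j})^2\, S(p(\nu_{2j}))/K_z^2, \qquad j = 1, 2, \cdots, S_2, \tag{117}$$

where

$$S(p) = K_x\, p \left[ wp + \frac{k^2 K_0 K_x (K_x - K_0)}{(K_x K_z - K_0^2)\,p^2 + k^2 K_0^2 (K_x - K_0)} \right]^{-1}. \tag{118}$$

From (100) through (104) and (111) through (113) it follows that

$$\psi_1(x, \nu_{1j}) = \cos(b(\nu_{1j})x), \qquad |x| \le w \tag{119}$$
$$\phantom{\psi_1(x, \nu_{1j})} = \cos(b(\nu_{1j})w)\exp\{p(\nu_{1j})(w - |x|)\}, \qquad |x| \ge w \tag{120}$$
$$\psi_2(x, \nu_{2j}) = \{K_z/b(\nu_{2j})\}\sin(b(\nu_{2j})x), \qquad |x| \le w \tag{121}$$
$$\phantom{\psi_2(x, \nu_{2j})} = \{K_z/b(\nu_{2j})\}\sin(b(\nu_{2j})w)\exp\{p(\nu_{2j})(w - x)\}, \qquad x \ge w, \tag{122}$$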


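$$\psi_2(x, \nu_{2j}) = -\psi_2(-x, \nu_{2j}), \qquad x \le -w. \tag{123}$$

It is also true that

$$\int_{-\infty}^{\infty} \psi_j^2(x, \nu_{jk})\,\{K_x(x)\}^{-1}\, dx = 1/\delta\sigma_{jj}(\nu_{jk}), \qquad k = 1, 2, \cdots, S_j, \quad j = 1, 2. \tag{124}$$

The remaining points in $I_2$ are not in the spectrum.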
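The jump formula (116) with $S(p)$ from (118) can be compared numerically with the normalization (124). The sketch below is illustrative only: the helper names and sample parameters are assumptions, the weight $\{K_x(x)\}^{-1}$ is taken to equal $1/K_x$ inside the guide and $1/K_0$ outside, and only the lowest even mode of (114) is located (by bisection on a bracket that always contains it).

```python
import math

def first_even_tm_mode(w, k, K_x, K_z, K_0, iters=200):
    """Lowest root (b, p) of (114), found by bisection."""
    v2 = k * k * (K_x - K_0)                 # p^2 + (K_x/K_z) b^2
    b_max = math.sqrt(K_z * v2 / K_x)

    def p_of(b):
        return math.sqrt(max(v2 - K_x * b * b / K_z, 0.0))

    def f(b):
        # b K_0 tan(bw) = p K_z, multiplied through by cos(bw)
        return b * K_0 * math.sin(b * w) - p_of(b) * K_z * math.cos(b * w)

    lo, hi = 1e-9 * b_max, min(b_max, 0.5 * math.pi / w)   # f(lo) < 0 < f(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    b = 0.5 * (lo + hi)
    return b, p_of(b)

def S_closed(p, w, k, K_x, K_z, K_0):
    """Closed form (118)."""
    frac = (k * k * K_0 * K_x * (K_x - K_0)
            / ((K_x * K_z - K_0 ** 2) * p * p + k * k * K_0 ** 2 * (K_x - K_0)))
    return K_x * p / (w * p + frac)

def norm_even_tm(b, p, w, K_x, K_0):
    """Left side of (124) for psi_1 of (119)-(120)."""
    inside = (w + math.sin(2 * b * w) / (2 * b)) / K_x
    outside = math.cos(b * w) ** 2 / (K_0 * p)
    return inside + outside

if __name__ == "__main__":
    args = (1.0, 1.0, 5.0, 4.0, 1.0)         # w, k, K_x, K_z, K_0 (sample values)
    b, p = first_even_tm_mode(*args)
    print(b, p, S_closed(p, *args) * norm_even_tm(b, p, 1.0, 5.0, 1.0))
```

The printed product should be 1 to rounding error if (116), (118), and (124) are mutually consistent.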
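The continuous spectrum is the interval $I_3 = [-k^2 K_0, \infty]$. For points of the continuous spectrum

$$d\sigma_{jj}(\nu) = \sigma'_{jj}(\nu)\, d\nu \qquad (j = 1, 2), \tag{125}$$

where

$$\sigma'_{11}(\nu) = \frac{1}{2\pi}\left[K_0^2\omega_x^2 \sin^2(K_b\omega_x w) + K_x K_z\omega_0^2 \cos^2(K_b\omega_x w)\right]^{-1} K_0 K_x K_z\, \omega_0, \tag{126}$$
$$\sigma'_{22}(\nu) = \frac{1}{2\pi}\left[K_0^2\omega_x^2 \cos^2(K_b\omega_x w) + K_x K_z\omega_0^2 \sin^2(K_b\omega_x w)\right]^{-1} K_0\, \omega_x^2\, \omega_0. \tag{127}$$

To summarize these results, the spectrum consists of the points $\nu_{jk}$, $k = 1, 2, \cdots, S_j$, $j = 1, 2$ and the interval $I_3$, and the transmitted field can be written in the form

$$\begin{aligned}
h_y^{(t)}(x, z) ={}& \sum_{j=1}^{2}\sum_{k=1}^{S_j} \delta\sigma_{jj}(\nu_{jk})\, h_j(\nu_{jk})\, \psi_j(x, \nu_{jk}) \exp\{-i\sqrt{-\nu_{jk}}\, z\} \\
&+ \sum_{j=1}^{2} \int_{-k^2 K_0}^{0} \exp\{-i\sqrt{-\nu}\, z\}\, \psi_j(x, \nu)\, h_j(\nu)\, \sigma'_{jj}(\nu)\, d\nu \\
&+ \sum_{j=1}^{2} \int_{0}^{\infty} \exp\{-\sqrt{\nu}\, z\}\, \psi_j(x, \nu)\, h_j(\nu)\, \sigma'_{jj}(\nu)\, d\nu.
\end{aligned} \tag{128}$$

Just as for the TE fields, the terms in the first, double summation in (128) are the possible TM modes which can be excited in the waveguide. The terms in the second summation represent the propagating continuum field while the terms in the last summation represent the evanescent part of the transmitted field. Just as for the TE fields, the propagating part of the continuum field can be interpreted as a wave packet of reflected and refracted plane waves, and the evanescent part of the field can be interpreted in terms of surface waves at $z = 0$. Equation (53) for the transmitted energy is

$$\begin{aligned}
P_t ={}& (2\omega\epsilon_0)^{-1} \sum_{j=1}^{2}\sum_{k=1}^{S_j} \sqrt{-\nu_{jk}}\; |h_j(\nu_{jk})|^2\, \delta\sigma_{jj}(\nu_{jk}) \\
&+ (2\omega\epsilon_0)^{-1} \sum_{j=1}^{2} \int_{-k^2 K_0}^{0} \sqrt{-\nu}\; |h_j(\nu)|^2\, \sigma'_{jj}(\nu)\, d\nu.
\end{aligned} \tag{129}$$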


3.4 TE Fields For Asymmetric Step Model


We now turn to the second of the two models which are studied in detail and examine the TE fields for the asymmetric step model. The functions $K_n(x)$ $(n = x, y, z)$ are defined by (3) through (5). Equation (21) has constant coefficients in the regions $|x| < w$, $x > w$, $x < -w$, and we seek solutions which are continuous and have continuous first derivatives. Then


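$$\phi_1(x, \nu) = \cos(\omega_y x), \qquad |x| \le w \tag{130}$$
$$\phantom{\phi_1(x, \nu)} = \cos(\omega_y w)\cos\{\omega_2(x - w)\} - (\omega_y/\omega_2)\sin(\omega_y w)\sin\{\omega_2(x - w)\}, \qquad x \ge w \tag{131}$$
$$\phantom{\phi_1(x, \nu)} = \cos(\omega_y w)\cos\{\omega_1(x + w)\} + (\omega_y/\omega_1)\sin(\omega_y w)\sin\{\omega_1(x + w)\}, \qquad x \le -w \tag{132}$$
$$\phi_2(x, \nu) = (1/\omega_y)\sin(\omega_y x), \qquad |x| \le w \tag{133}$$
$$\phantom{\phi_2(x, \nu)} = (1/\omega_y)\sin(\omega_y w)\cos\{\omega_2(x - w)\} + (1/\omega_2)\cos(\omega_y w)\sin\{\omega_2(x - w)\}, \qquad x \ge w \tag{134}$$
$$\phantom{\phi_2(x, \nu)} = -(1/\omega_y)\sin(\omega_y w)\cos\{\omega_1(x + w)\} + (1/\omega_1)\cos(\omega_y w)\sin\{\omega_1(x + w)\}, \qquad x \le -w \tag{135}$$

where

$$\omega_n(\nu) = (\nu + k^2 K_n)^{1/2} \qquad (n = 1, 2, x, y). \tag{136}$$

As before $\omega_n$ is defined as a single-valued function of $\nu$ in the complex plane cut along the real axis from $-k^2 K_n$ to $\infty$. Then

$$m_{\infty}(\nu) = \{\omega_y \sin(\omega_y w) + i\omega_2 \cos(\omega_y w)\}\{\cos(\omega_y w) - i(\omega_2/\omega_y)\sin(\omega_y w)\}^{-1}, \tag{137}$$
$$m_{-\infty}(\nu) = -\{\omega_y \sin(\omega_y w) + i\omega_1 \cos(\omega_y w)\}\{\cos(\omega_y w) - i(\omega_1/\omega_y)\sin(\omega_y w)\}^{-1}. \tag{138}$$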
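From (58) through (60) and (137) through (138) we obtain

$$M_{jk}(\nu) = N_{jk}(\nu)/D(\nu) \qquad (j, k = 1, 2), \tag{139}$$

where

$$N_{11}(\nu) = -\tfrac{1}{2}\left[(1 - \omega_1\omega_2/\omega_y^2) + (1 + \omega_1\omega_2/\omega_y^2)\cos(2\omega_y w) - i\{(\omega_1 + \omega_2)/\omega_y\}\sin(2\omega_y w)\right], \tag{140}$$
$$N_{12}(\nu) = N_{21}(\nu) = (i/2)(\omega_1 - \omega_2), \tag{141}$$
$$N_{22}(\nu) = \tfrac{1}{2}\{(\omega_y^2 - \omega_1\omega_2) - (\omega_y^2 + \omega_1\omega_2)\cos(2\omega_y w) + i\omega_y(\omega_1 + \omega_2)\sin(2\omega_y w)\}, \tag{142}$$
$$D(\nu) = (\omega_y + \omega_1\omega_2/\omega_y)\sin(2\omega_y w) + i(\omega_1 + \omega_2)\cos(2\omega_y w). \tag{143}$$

To determine the spectrum we note first that in the interval $I_1 = [-\infty, -k^2 K_y]$, the functions $M_{jk}(\nu)$ $(j, k = 1, 2)$ are analytic and real. This interval, therefore, contains no points of the spectrum and

$$d\rho_{jk}(\nu) = 0 \qquad (j, k = 1, 2), \quad \nu \in I_1. \tag{144}$$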


The only real poles of the functions $M_{jk}(\nu)$ are in the interval $I_2 = (-k^2 K_y, -k^2 K_1)$. These poles are the real solutions of $D(\nu) = 0$. In $I_2$, $\omega_y$ is real while $\omega_1$ and $\omega_2$ are purely imaginary. If we let

$$b(\nu) = \omega_y(\nu), \qquad p_n(\nu) = -i\omega_n(\nu) = (-\nu - k^2 K_n)^{1/2} \qquad (n = 1, 2), \tag{145}$$

then the equation $D(\nu) = 0$ is equivalent to the set of four equations

$$-\nu = k^2 K_1 + p_1^2, \qquad -\nu = k^2 K_2 + p_2^2, \qquad -\nu = k^2 K_y - b^2,$$
$$\tan 2bw = \{p_1/b + p_2/b\}/\{1 - (p_1/b)(p_2/b)\}, \tag{146}$$

in the three positive real unknowns $b$, $p_1$, $p_2$ and the original unknown $\nu$. These equations and their solutions have also been studied in detail.>»* In order that (146) have a solution, it is necessary and sufficient that

$$K_y > K_n \quad (n = 1, 2), \qquad 2wk(K_y - K_1)^{1/2} > \tan^{-1}\{(K_1 - K_2)/(K_y - K_1)\}^{1/2}. \tag{147}$$

If conditions (147) are satisfied, $D(\nu) = 0$ has a finite number of real solutions, $\nu_j$, $j = 1, 2, \cdots, R$, which all lie in the interval $I_2$. This is the first significant difference between the symmetric and asymmetric step models. The symmetric step model always has at least one point in its point spectrum while the asymmetric step model may have no point spectrum.

We can write, assuming that (146) and (147) are satisfied,

$$\delta\rho_{jk}(\nu_l) = -N_{jk}(\nu_l)/D'(\nu_l), \qquad j, k = 1, 2, \quad l = 1, 2, \cdots, R, \tag{148}$$

where $D'(\nu) = (d/d\nu)D(\nu)$. If we make use of (145), it is easy to show that

$$\{\delta\rho_{12}(\nu_l)\}^2 = \delta\rho_{11}(\nu_l)\,\delta\rho_{22}(\nu_l), \qquad l = 1, 2, \cdots, R. \tag{149}$$
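The existence condition (147) and the transcendental system (146) can be compared numerically. The sketch below is an illustration, not the procedure of the references cited above: the function names, the sign-change counting, and the parameter values are assumptions, and it takes $K_1 \ge K_2$ as the text implies. The relation for $\tan 2bw$ is multiplied through by its denominators so that the scanned function has no poles.

```python
import math

def has_guided_mode(w, k, K_y, K_1, K_2):
    """Condition (147); assumes K_1 >= K_2."""
    if not (K_y > K_1 and K_y > K_2):
        return False
    lhs = 2.0 * w * k * math.sqrt(K_y - K_1)
    rhs = math.atan(math.sqrt((K_1 - K_2) / (K_y - K_1)))
    return lhs > rhs

def count_modes(w, k, K_y, K_1, K_2, steps=20000):
    """Count roots of D(nu) = 0 in I_2 via sign changes in b."""
    if K_y <= K_1:
        return 0
    b_max = k * math.sqrt(K_y - K_1)      # p_1 -> 0 at the upper end of I_2

    def g(b):
        p1 = math.sqrt(max(k * k * (K_y - K_1) - b * b, 0.0))
        p2 = math.sqrt(k * k * (K_y - K_2) - b * b)
        # tan(2bw) = (p1/b + p2/b) / (1 - p1 p2 / b^2), cleared of denominators
        return ((b * b - p1 * p2) * math.sin(2 * b * w)
                - b * (p1 + p2) * math.cos(2 * b * w))

    count = 0
    prev = g(b_max / steps)
    for i in range(2, steps + 1):
        cur = g(b_max * i / steps)
        if prev * cur < 0.0:
            count += 1
        if cur != 0.0:
            prev = cur
    return count

if __name__ == "__main__":
    print(has_guided_mode(1.0, 1.0, 5.0, 2.0, 1.0), count_modes(1.0, 1.0, 5.0, 2.0, 1.0))
    print(has_guided_mode(0.1, 1.0, 2.1, 2.0, 1.0), count_modes(0.1, 1.0, 2.1, 2.0, 1.0))
```

With $K_1 = K_2$ the right side of (147) vanishes and a mode always exists, which recovers the symmetric-model statement.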


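Neither of the functions $\phi_1(x, \nu_j)$ or $\phi_2(x, \nu_j)$ is square integrable over $-\infty < x < \infty$ for $j = 1, 2, \cdots, R$. However, because of (149), they appear in (27) for $e_y^{(t)}(x, z)$ only in the combination

$$\Phi(x, \nu_j) = \sqrt{\delta\rho_{11}(\nu_j)}\,\phi_1(x, \nu_j) + \{\delta\rho_{12}(\nu_j)/\sqrt{\delta\rho_{11}(\nu_j)}\}\,\phi_2(x, \nu_j), \qquad j = 1, 2, \cdots, R. \tag{150}$$

If we define

$$\Phi_0(x, \nu_j) = \sqrt{\delta\rho_{11}(\nu_j)}\cos(b(\nu_j)x) + \{\delta\rho_{12}(\nu_j)/\sqrt{\delta\rho_{11}(\nu_j)}\; b(\nu_j)\}\sin(b(\nu_j)x), \tag{151}$$

then because of (146)

$$\Phi(x, \nu_j) = \Phi_0(x, \nu_j), \qquad |x| \le w \tag{152}$$
$$\phantom{\Phi(x, \nu_j)} = \Phi_0(w, \nu_j)\exp\{p_2(\nu_j)(w - x)\}, \qquad x \ge w \tag{153}$$
$$\phantom{\Phi(x, \nu_j)} = \Phi_0(-w, \nu_j)\exp\{p_1(\nu_j)(w + x)\}, \qquad x \le -w. \tag{154}$$

Thus, the functions $\Phi(x, \nu_j)$ are square integrable, and, as we shall see, are just the possible propagating modes in the waveguide. The remaining points in the interval $I_2$ are not in the spectrum.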

The remainder of the real axis, the interval $-k^2 K_1 \le \nu \le \infty$, forms the continuous spectrum. To show this, consider first the interval $I_3 = [-k^2 K_1, -k^2 K_2]$. In $I_3$, $\omega_y$ and $\omega_1$ are real, while $\omega_2$ is purely imaginary. The functions $M_{jk}(\nu)$ have no poles in $I_3$ and their imaginary parts are not zero. We introduce the notation

$$\omega_y(\nu) = b(\nu), \qquad \omega_1(\nu) = p_1(\nu), \qquad \omega_2(\nu) = i p_2(\nu), \qquad \nu \in I_3. \tag{155}$$

Then we can write

$$d\rho_{jk}(\nu) = \frac{1}{\pi}\{p_1(\nu)/\Delta(\nu)\}\, r_j(\nu)\, r_k(\nu)\, d\nu \qquad (j, k = 1, 2), \tag{156}$$

where

$$r_1(\nu) = \cos bw + (p_2/b)\sin bw, \tag{157}$$
$$r_2(\nu) = b\sin bw - p_2\cos bw, \tag{158}$$
$$\Delta(\nu) = \{b\sin 2bw - p_2\cos 2bw\}^2 + \{(p_1 p_2/b)\sin 2bw + p_1\cos 2bw\}^2. \tag{159}$$


For $\nu \in I_3$ it is clear from (131), (134), and (155) that $\phi_1(x, \nu)$ and $\phi_2(x, \nu)$ both grow exponentially as $x \to +\infty$. However, from (156) we see that in (27) for $e_y^{(t)}(x, z)$, the functions $\phi_j(x, \nu)$ $(j = 1, 2)$ appear only in the combination

$$\Lambda(x, \nu) = r_1(\nu)\,\phi_1(x, \nu) + r_2(\nu)\,\phi_2(x, \nu) \tag{160}$$

when $\nu \in I_3$. However,

$$\Lambda(x, \nu) = \cos\{b(x - w)\} - (p_2/b)\sin\{b(x - w)\}, \qquad |x| \le w \tag{161}$$
$$\phantom{\Lambda(x, \nu)} = \exp\{p_2(w - x)\}, \qquad x \ge w \tag{162}$$
$$\phantom{\Lambda(x, \nu)} = (\cos 2bw + (p_2/b)\sin 2bw)\cos\{p_1(x + w)\} + (1/p_1)(b\sin 2bw - p_2\cos 2bw)\sin\{p_1(x + w)\}, \qquad x \le -w. \tag{163}$$


Equations (161) through (163) represent the second important differ- 
ence between the symmetric and asymmetric step models. In the sym- 
metric model all the components of the continuum field are oscillatory 
functions of x on both sides of the waveguide while in the asymmetric 
model some of the components of the continuum field are exponentially 
damped on one side of the waveguide. The physical interpretation of $\Lambda(x, \nu)$ will be discussed later.
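The factorized form (156) can be checked against the general rule (63) applied to the matrix elements (139) through (143). The following sketch is illustrative: the function name and parameter values are assumptions, and $\omega_2 = i p_2$ is substituted directly as a complex number, following (155).

```python
import math

def spectral_check(nu, w, k, K_y, K_1, K_2):
    """Return ((1/pi) Im M_jk from (139)-(143), right side of (156)) for nu in I_3.

    Each list is ordered (11, 12, 22).
    """
    b = math.sqrt(nu + k * k * K_y)
    p1 = math.sqrt(nu + k * k * K_1)
    p2 = math.sqrt(-nu - k * k * K_2)
    wy, w1, w2 = b, p1, 1j * p2                       # notation (155)
    s2, c2 = math.sin(2 * b * w), math.cos(2 * b * w)

    D = (wy + w1 * w2 / wy) * s2 + 1j * (w1 + w2) * c2                  # (143)
    N11 = -0.5 * ((1 - w1 * w2 / wy ** 2) + (1 + w1 * w2 / wy ** 2) * c2
                  - 1j * ((w1 + w2) / wy) * s2)                         # (140)
    N12 = 0.5j * (w1 - w2)                                              # (141)
    N22 = 0.5 * ((wy ** 2 - w1 * w2) - (wy ** 2 + w1 * w2) * c2
                 + 1j * wy * (w1 + w2) * s2)                            # (142)

    r1 = math.cos(b * w) + (p2 / b) * math.sin(b * w)                   # (157)
    r2 = b * math.sin(b * w) - p2 * math.cos(b * w)                     # (158)
    delta = (b * s2 - p2 * c2) ** 2 + ((p1 * p2 / b) * s2 + p1 * c2) ** 2  # (159)

    lhs = [(N / D).imag / math.pi for N in (N11, N12, N22)]
    rhs = [p1 * x / (math.pi * delta) for x in (r1 * r1, r1 * r2, r2 * r2)]
    return lhs, rhs

if __name__ == "__main__":
    for nu in (-1.9, -1.5, -1.1):          # I_3 = [-2, -1] for the values below
        print(nu, spectral_check(nu, 1.0, 1.0, 5.0, 2.0, 1.0))
```

The rank-one structure $r_j r_k$ is what allows $\phi_1$ and $\phi_2$ to enter only through the single bounded combination $\Lambda$.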

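In the remaining interval, $I_4 = [-k^2 K_2, \infty]$, the functions $\omega_n$ $(n = 1, 2, y)$ are all real and the functions $M_{jk}(\nu)$ $(j, k = 1, 2)$ have no poles. Therefore,

$$d\rho_{jk}(\nu) = \rho'_{jk}(\nu)\, d\nu, \tag{164}$$

where

$$\rho'_{11}(\nu) = \frac{1}{\pi}(\omega_1 + \omega_2)\{\omega_y^2\cos^2(\omega_y w) + \omega_1\omega_2\sin^2(\omega_y w)\}/\mathcal{D}, \tag{165}$$
$$\rho'_{12}(\nu) = \rho'_{21}(\nu) = \frac{1}{\pi}\,\omega_y(\omega_1 - \omega_2)(\omega_y^2 + \omega_1\omega_2)\sin(\omega_y w)\cos(\omega_y w)/\mathcal{D}, \tag{166}$$
$$\rho'_{22}(\nu) = \frac{1}{\pi}\,\omega_y^2(\omega_1 + \omega_2)\{\omega_y^2\sin^2(\omega_y w) + \omega_1\omega_2\cos^2(\omega_y w)\}/\mathcal{D}, \tag{167}$$
$$\mathcal{D}(\nu) = (\omega_y^2 + \omega_1\omega_2)^2\sin^2(2\omega_y w) + \omega_y^2(\omega_1 + \omega_2)^2\cos^2(2\omega_y w). \tag{168}$$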


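The spectrum for the TE fields of the asymmetric model consists of the (possibly empty) set of points $\nu_j$, $j = 1, 2, \cdots, R$ and the interval $-k^2 K_1 \le \nu \le \infty$. The transmitted field can now be written in the following way.

$$\begin{aligned}
e_y^{(t)}(x, z) ={}& \sum_{j=1}^{R}\sum_{k=1}^{2} \{\delta\rho_{11}(\nu_j)\}^{-1/2}\,\delta\rho_{1k}(\nu_j)\, g_k(\nu_j) \exp\{-i\sqrt{-\nu_j}\, z\}\,\Phi(x, \nu_j) \\
&+ \frac{1}{\pi}\int_{-k^2 K_1}^{-k^2 K_2} \exp\{-i\sqrt{-\nu}\, z\}\,\Lambda(x, \nu) \sum_{j=1}^{2} r_j(\nu)\, g_j(\nu)\,\{p_1(\nu)/\Delta(\nu)\}\, d\nu \\
&+ \sum_{j,k=1}^{2}\int_{-k^2 K_2}^{0} \exp\{-i\sqrt{-\nu}\, z\}\,\phi_j(x, \nu)\, g_k(\nu)\,\rho'_{jk}(\nu)\, d\nu \\
&+ \sum_{j,k=1}^{2}\int_{0}^{\infty} \exp\{-\sqrt{\nu}\, z\}\,\phi_j(x, \nu)\, g_k(\nu)\,\rho'_{jk}(\nu)\, d\nu.
\end{aligned} \tag{169}$$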


The expression for $e_y^{(t)}(x, z)$ has been split up into a sum of parts in order to facilitate its physical interpretation. The first part represents the possible discrete, propagating modes which can be excited in the system. The form of these modes has been studied in detail elsewhere,°’° and as pointed out earlier, unless condition (147) is satisfied, no such modes can be excited. In order to interpret the second term, consider within the waveguide in the region $x < -w$ an incident plane wave of the form

$$e_y^{(i)}(x, z, \nu) = \exp\{-i\sqrt{-\nu}\, z - i\omega_1(\nu)x\}. \tag{170}$$

At the surface $x = -w$, part of this wave will be reflected and part will be transmitted. However, at the surface $x = w$, the wave will suffer total internal reflection. The total electromagnetic field set up by $e_y^{(i)}(x, z, \nu)$ is proportional to $\Lambda(x, \nu)\exp\{-i\sqrt{-\nu}\, z\}$. The second term is then just a superposition of plane waves which are totally reflected at $x = w$. In Fig. 4 we give a schematic description of these waves. In microscopy the theory of the Becke line is based on just such a superposition of totally reflected plane waves.’° The third term is a superposition of plane waves which are reflected and refracted at $x = \pm w$. The last term is a superposition of waves bound to the surface $z = 0$ and propagating in the positive and negative $x$ directions.

Fig. 4. A schematic diagram of the totally reflected wave in the asymmetric step model. The wave is incident on the junction region at $x = -w$ where it is partly reflected and partly transmitted. The partly transmitted portion is then totally reflected at $x = w$.

The time averaged, transmitted power is 


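$$\begin{aligned}
P_t ={}& (2\omega\mu_0)^{-1} \sum_{l=1}^{R} \sqrt{-\nu_l}\; \left|\sqrt{\delta\rho_{11}(\nu_l)}\, g_1(\nu_l) + \{\delta\rho_{12}(\nu_l)/\sqrt{\delta\rho_{11}(\nu_l)}\}\, g_2(\nu_l)\right|^2 \\
&+ (2\pi\omega\mu_0)^{-1} \int_{-k^2 K_1}^{-k^2 K_2} \sqrt{-\nu}\; |r_1(\nu)g_1(\nu) + r_2(\nu)g_2(\nu)|^2\,\{p_1(\nu)/\Delta(\nu)\}\, d\nu \\
&+ (2\omega\mu_0)^{-1} \int_{-k^2 K_2}^{0} \sqrt{-\nu} \sum_{j,k=1}^{2} g_j(\nu)\, g_k^*(\nu)\,\rho'_{jk}(\nu)\, d\nu.
\end{aligned} \tag{171}$$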


3.5 TM Fields For Asymmetric Step Model 


The TM fields for the asymmetric model present no new features, and we merely record the results. We have


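$$\psi_1(x, \nu) = \cos(K_b\omega_x x), \qquad |x| \le w \tag{172}$$
$$\phantom{\psi_1(x, \nu)} = \cos(K_b\omega_x w)\cos\{\omega_2(x - w)\} - (\omega_x K_2/\omega_2 K_a)\sin(K_b\omega_x w)\sin\{\omega_2(x - w)\}, \qquad x \ge w \tag{173}$$
$$\phantom{\psi_1(x, \nu)} = \cos(K_b\omega_x w)\cos\{\omega_1(x + w)\} + (\omega_x K_1/\omega_1 K_a)\sin(K_b\omega_x w)\sin\{\omega_1(x + w)\}, \qquad x \le -w \tag{174}$$
$$\psi_2(x, \nu) = (K_a/\omega_x)\sin(K_b\omega_x x), \qquad |x| \le w \tag{175}$$
$$\phantom{\psi_2(x, \nu)} = (K_a/\omega_x)\sin(K_b\omega_x w)\cos\{\omega_2(x - w)\} + (K_2/\omega_2)\cos(K_b\omega_x w)\sin\{\omega_2(x - w)\}, \qquad x \ge w \tag{176}$$
$$\phantom{\psi_2(x, \nu)} = -(K_a/\omega_x)\sin(K_b\omega_x w)\cos\{\omega_1(x + w)\} + (K_1/\omega_1)\cos(K_b\omega_x w)\sin\{\omega_1(x + w)\}, \qquad x \le -w \tag{177}$$

where $\omega_n(\nu)$ $(n = x, 1, 2)$ are defined in (136) and $K_a$ and $K_b$ are defined in (105). Next,

$$m_{\infty}(\nu) = \{(\omega_x/K_a)\sin(K_b\omega_x w) + i(\omega_2/K_2)\cos(K_b\omega_x w)\}\{\cos(K_b\omega_x w) - i(\omega_2 K_a/\omega_x K_2)\sin(K_b\omega_x w)\}^{-1}, \tag{178}$$
$$m_{-\infty}(\nu) = -\{(\omega_x/K_a)\sin(K_b\omega_x w) + i(\omega_1/K_1)\cos(K_b\omega_x w)\}\{\cos(K_b\omega_x w) - i(\omega_1 K_a/\omega_x K_1)\sin(K_b\omega_x w)\}^{-1}. \tag{179}$$

Then from (58) through (60), (178), and (179) we obtain

$$M_{jk}(\nu) = N_{jk}(\nu)/D(\nu) \qquad (j, k = 1, 2), \tag{180}$$

where

$$N_{11}(\nu) = -\tfrac{1}{2}\left[(1 - \omega_1\omega_2 K_a^2/\omega_x^2 K_1 K_2) + (1 + \omega_1\omega_2 K_a^2/\omega_x^2 K_1 K_2)\cos(2K_b\omega_x w) - i(K_a/\omega_x)(\omega_1/K_1 + \omega_2/K_2)\sin(2K_b\omega_x w)\right], \tag{181}$$
$$N_{12}(\nu) = N_{21}(\nu) = (i/2)(\omega_1/K_1 - \omega_2/K_2), \tag{182}$$
$$N_{22}(\nu) = \tfrac{1}{2}\left[(\omega_x^2/K_a^2 - \omega_1\omega_2/K_1 K_2) - (\omega_x^2/K_a^2 + \omega_1\omega_2/K_1 K_2)\cos(2K_b\omega_x w) + i(\omega_x/K_a)(\omega_1/K_1 + \omega_2/K_2)\sin(2K_b\omega_x w)\right], \tag{183}$$
$$D(\nu) = (\omega_x/K_a + \omega_1\omega_2 K_a/\omega_x K_1 K_2)\sin(2K_b\omega_x w) + i(\omega_1/K_1 + \omega_2/K_2)\cos(2K_b\omega_x w). \tag{184}$$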


There are no points of the spectrum in $I_1 = [-\infty, -k^2 K_x]$. The only real poles of the functions $M_{jk}(\nu)$ are in the interval $I_2 = (-k^2 K_x, -k^2 K_1)$. In $I_2$, $\omega_x$ is real while $\omega_1$ and $\omega_2$ are imaginary. If we let

$$b(\nu) = K_b\,\omega_x(\nu), \qquad p_n(\nu) = -i\omega_n(\nu), \qquad n = 1, 2, \tag{185}$$

then the equation determining the poles, $D(\nu) = 0$, is equivalent to the set of equations

$$-\nu = k^2 K_1 + p_1^2, \qquad -\nu = k^2 K_2 + p_2^2, \qquad -\nu = k^2 K_x - \frac{K_x}{K_z}\, b^2,$$
$$\tan 2bw = (p_1 K_z/bK_1 + p_2 K_z/bK_2)/(1 - p_1 p_2 K_z^2/b^2 K_1 K_2). \tag{186}$$

In order that these equations have a solution, it is necessary and sufficient that> °

$$K_x > K_n \quad (n = 1, 2), \qquad 2wk\{K_z(K_x - K_1)/K_x\}^{1/2} > \tan^{-1}\{K_x K_z(K_1 - K_2)/K_2^2(K_x - K_1)\}^{1/2}. \tag{187}$$

If conditions (187) are satisfied, $D(\nu) = 0$ has a finite number of real solutions in $I_2$, $\nu_j$, $j = 1, 2, \cdots, S$.

If (186) and (187) are satisfied, we can write

$$\delta\sigma_{jk}(\nu_l) = -N_{jk}(\nu_l)/D'(\nu_l), \qquad j, k = 1, 2, \quad l = 1, 2, \cdots, S. \tag{188}$$

Just as for the TE fields, it is true that

$$\{\delta\sigma_{12}(\nu_l)\}^2 = \delta\sigma_{11}(\nu_l)\,\delta\sigma_{22}(\nu_l), \qquad l = 1, 2, \cdots, S. \tag{189}$$


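Because of (189) the functions $\psi_1(x, \nu_l)$ and $\psi_2(x, \nu_l)$ appear in (49) for $h_y^{(t)}(x, z)$ only in the combination

$$\Psi(x, \nu_j) = \sqrt{\delta\sigma_{11}(\nu_j)}\,\psi_1(x, \nu_j) + \{\delta\sigma_{12}(\nu_j)/\sqrt{\delta\sigma_{11}(\nu_j)}\}\,\psi_2(x, \nu_j), \qquad j = 1, 2, \cdots, S. \tag{190}$$

If we define

$$\Psi_0(x, \nu_j) = \sqrt{\delta\sigma_{11}(\nu_j)}\cos(b(\nu_j)x) + \{K_z\,\delta\sigma_{12}(\nu_j)/\sqrt{\delta\sigma_{11}(\nu_j)}\; b(\nu_j)\}\sin(b(\nu_j)x), \tag{191}$$

then because of (186)

$$\Psi(x, \nu_j) = \Psi_0(x, \nu_j), \qquad |x| \le w \tag{192}$$
$$\phantom{\Psi(x, \nu_j)} = \Psi_0(w, \nu_j)\exp\{p_2(\nu_j)(w - x)\}, \qquad x \ge w \tag{193}$$
$$\phantom{\Psi(x, \nu_j)} = \Psi_0(-w, \nu_j)\exp\{p_1(\nu_j)(w + x)\}, \qquad x \le -w. \tag{194}$$

The remaining points in $I_2$ are not in the spectrum.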
The remainder of the real axis, the interval $-k^2 K_1 \le \nu \le \infty$, forms the continuous spectrum. In the subinterval $I_3 = [-k^2 K_1, -k^2 K_2)$, $\omega_x$ and $\omega_1$ are real while $\omega_2$ is imaginary. If we let

$$K_b\,\omega_x(\nu) = b(\nu), \qquad \omega_1(\nu) = p_1(\nu), \qquad \omega_2(\nu) = i p_2(\nu), \qquad \nu \in I_3, \tag{195}$$

then we can write

$$d\sigma_{jk}(\nu) = \frac{1}{\pi}\{p_1(\nu)/K_1\Delta(\nu)\}\, s_j(\nu)\, s_k(\nu)\, d\nu \qquad (j, k = 1, 2), \tag{196}$$

where

$$s_1(\nu) = \cos bw + (p_2 K_z/bK_2)\sin bw, \tag{197}$$
$$s_2(\nu) = -(p_2/K_2)\cos bw + (b/K_z)\sin bw, \tag{198}$$
$$\Delta(\nu) = \{(b/K_z)\sin 2bw - (p_2/K_2)\cos 2bw\}^2 + \{(p_1 p_2 K_z/bK_1 K_2)\sin 2bw + (p_1/K_1)\cos 2bw\}^2. \tag{199}$$

When $\nu \in I_3$, $\psi_1(x, \nu)$ and $\psi_2(x, \nu)$ appear in (51) for $h_y^{(t)}(x, z)$ only in the combination

$$\Xi(x, \nu) = s_1(\nu)\,\psi_1(x, \nu) + s_2(\nu)\,\psi_2(x, \nu). \tag{200}$$


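We have

$$\Xi(x, \nu) = \cos\{b(x - w)\} - (p_2 K_z/bK_2)\sin\{b(x - w)\}, \qquad |x| \le w \tag{201}$$
$$\phantom{\Xi(x, \nu)} = \exp\{p_2(w - x)\}, \qquad x \ge w \tag{202}$$
$$\phantom{\Xi(x, \nu)} = \{\cos 2bw + (p_2 K_z/bK_2)\sin 2bw\}\cos\{p_1(x + w)\} + (1/p_1)\{(bK_1/K_z)\sin 2bw - (p_2 K_1/K_2)\cos 2bw\}\sin\{p_1(x + w)\}, \qquad x \le -w. \tag{203}$$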


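In the remaining interval, $I_4 = [-k^2 K_2, \infty]$, the functions $\omega_n$ $(n = 1, 2, x)$ are all real and we can write

$$d\sigma_{jk}(\nu) = \sigma'_{jk}(\nu)\, d\nu, \tag{204}$$

where

$$\sigma'_{11}(\nu) = \frac{1}{\pi}(\omega_1/K_1 + \omega_2/K_2)\{(\omega_x^2/K_a^2)\cos^2(K_b\omega_x w) + (\omega_1\omega_2/K_1 K_2)\sin^2(K_b\omega_x w)\}/\mathcal{D}, \tag{205}$$
$$\sigma'_{12}(\nu) = \sigma'_{21}(\nu) = \frac{1}{\pi}(\omega_x/K_a)(\omega_1/K_1 - \omega_2/K_2)\{\omega_x^2/K_a^2 + \omega_1\omega_2/K_1 K_2\}\sin(K_b\omega_x w)\cos(K_b\omega_x w)/\mathcal{D}, \tag{206}$$
$$\sigma'_{22}(\nu) = \frac{1}{\pi}(\omega_x/K_a)^2(\omega_1/K_1 + \omega_2/K_2)\{(\omega_1\omega_2/K_1 K_2)\cos^2(K_b\omega_x w) + (\omega_x/K_a)^2\sin^2(K_b\omega_x w)\}/\mathcal{D}, \tag{207}$$
$$\mathcal{D} = \{(\omega_x/K_a)^2 + (\omega_1\omega_2/K_1 K_2)\}^2\sin^2(2K_b\omega_x w) + (\omega_x/K_a)^2(\omega_1/K_1 + \omega_2/K_2)^2\cos^2(2K_b\omega_x w). \tag{208}$$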


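To summarize, the spectrum for the TM waves of the asymmetric model consists of the (possibly empty) set of points $\nu_l$, $l = 1, 2, \cdots, S$, and the interval $-k^2 K_1 \le \nu \le \infty$. The transmitted field can be written as

$$\begin{aligned}
h_y^{(t)}(x, z) ={}& \sum_{j=1}^{S}\sum_{k=1}^{2} \{\delta\sigma_{11}(\nu_j)\}^{-1/2}\,\delta\sigma_{1k}(\nu_j)\, h_k(\nu_j) \exp\{-i\sqrt{-\nu_j}\, z\}\,\Psi(x, \nu_j) \\
&+ \frac{1}{\pi}\int_{-k^2 K_1}^{-k^2 K_2} \exp\{-i\sqrt{-\nu}\, z\}\,\Xi(x, \nu) \sum_{j=1}^{2} s_j(\nu)\, h_j(\nu)\,\{p_1(\nu)/K_1\Delta(\nu)\}\, d\nu \\
&+ \sum_{j,k=1}^{2}\int_{-k^2 K_2}^{0} \exp\{-i\sqrt{-\nu}\, z\}\,\psi_j(x, \nu)\, h_k(\nu)\,\sigma'_{jk}(\nu)\, d\nu \\
&+ \sum_{j,k=1}^{2}\int_{0}^{\infty} \exp\{-\sqrt{\nu}\, z\}\,\psi_j(x, \nu)\, h_k(\nu)\,\sigma'_{jk}(\nu)\, d\nu.
\end{aligned} \tag{209}$$

The time averaged, transmitted power is

$$\begin{aligned}
P_t ={}& (2\omega\epsilon_0)^{-1} \sum_{l=1}^{S} \sqrt{-\nu_l}\; \left|\sqrt{\delta\sigma_{11}(\nu_l)}\, h_1(\nu_l) + \{\delta\sigma_{12}(\nu_l)/\sqrt{\delta\sigma_{11}(\nu_l)}\}\, h_2(\nu_l)\right|^2 \\
&+ (2\pi\omega\epsilon_0)^{-1} \int_{-k^2 K_1}^{-k^2 K_2} \sqrt{-\nu}\; |s_1(\nu)h_1(\nu) + s_2(\nu)h_2(\nu)|^2\,\{p_1(\nu)/K_1\Delta(\nu)\}\, d\nu \\
&+ (2\omega\epsilon_0)^{-1} \int_{-k^2 K_2}^{0} \sqrt{-\nu} \sum_{j,k=1}^{2} h_j(\nu)\, h_k^*(\nu)\,\sigma'_{jk}(\nu)\, d\nu.
\end{aligned} \tag{210}$$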


IV. APPROXIMATE SOLUTION OF THE INTEGRAL EQUATIONS 


In Section II we obtained general expressions for the reflected and transmitted fields for the TE fields in (18) and (27) and for the TM fields in (41) and (51). In (27) and (51) there appear the functions $\phi_j(x, \nu)$ and $\psi_j(x, \nu)$ and the spectral density matrices $\rho(\nu)$ and $\sigma(\nu)$. A technique for determining these quantities in certain cases was illustrated in Section III by explicitly calculating them for the symmetric and asymmetric step models. In order to complete the determination of the reflected and transmitted fields, the functions $\mathcal{E}_y^{(r)}(l)$, $\mathcal{H}_y^{(r)}(l)$, $g_k(\nu)$, and $h_k(\nu)$ must be calculated. In Section II we showed that these functions were determined by the integral equations (30)-(31) and (54)-(55).

We have been unable to solve these integral equations exactly for the general case. However, there are certain cases of great physical interest, such as the electro-optic diode modulator, where excellent approximate solutions can be obtained. Let

$$M_n = \max_x K_n(x), \qquad m_n = \min_x K_n(x) \qquad (n = x, y, z), \tag{211}$$

and assume that

$$(M_n - m_n)/m_n \ll 1 \qquad (n = x, y, z). \tag{212}$$


Then the incident field impinges on an essentially uniform, plane dielectric interface, and the reflected field can be calculated as if the region $z > 0$ were a uniform dielectric. Let $\bar{K}_n$ $(n = x, y, z)$ be suitably chosen, constant values for the dielectric tensor for $z > 0$. Then it is readily shown that for the TE fields

$$\mathcal{E}_y^{(r)}(l) = R_e(l)\,\mathcal{E}_y^{(i)}(l), \tag{213}$$

and for the TM fields

$$\mathcal{H}_y^{(r)}(l) = R_h(l)\,\mathcal{H}_y^{(i)}(l), \tag{214}$$

where the reflection coefficients are

$$R_e(l) = \{\Omega(l) - k_y\,\Omega(l/k_y)\}\{\Omega(l) + k_y\,\Omega(l/k_y)\}^{-1}, \tag{215}$$
$$R_h(l) = \{k_x\,\Omega(l) - \Omega(l/k_z)\}\{k_x\,\Omega(l) + \Omega(l/k_z)\}^{-1}, \tag{216}$$
$$k_n = (\bar{K}_n)^{1/2} \qquad (n = x, y, z), \tag{217}$$

and $\Omega(l)$ is defined in (16). In this approximation, the total fields at $z = 0$ for the TE and TM fields are, respectively,

2G0):= + i THe (De ™ dl, (218) 


aCe ee * [ T,()5e (De dl, (219) 
where the transmission coefficients are

$$T_n(l) = 1 + R_n(l), \qquad n = e, h. \tag{220}$$

Now that $e_y(x, 0)$ and $h_y(x, 0)$ are known, $g_j(\nu)$ $(j = 1, 2)$ can be calculated from (28) and $h_j(\nu)$ $(j = 1, 2)$ can be calculated from (52), since $e_y(x, 0) = e_y^{(t)}(x, 0)$ and $h_y(x, 0) = h_y^{(t)}(x, 0)$.
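The reflection and transmission coefficients (215), (216), and (220) are simple to evaluate. The sketch below is illustrative and rests on a loudly stated assumption: equation (16) is not reproduced in this section, so $\Omega(l) = (k^2 - l^2)^{1/2}$ is assumed here, and the branch chosen for $|l| > k$ is simply whatever `cmath.sqrt` returns, which may differ from the paper's convention. The function names and the sample dielectric value are hypothetical.

```python
import cmath
import math

def Omega(l, k):
    """ASSUMED form of the function defined in (16): (k^2 - l^2)^(1/2)."""
    return cmath.sqrt(k * k - l * l)

def R_e(l, k, K_y_bar):
    """TE reflection coefficient (215), with k_y = sqrt(K_y_bar) from (217)."""
    k_y = math.sqrt(K_y_bar)
    a, b = Omega(l, k), k_y * Omega(l / k_y, k)
    return (a - b) / (a + b)

def R_h(l, k, K_x_bar, K_z_bar):
    """TM reflection coefficient (216)."""
    k_x, k_z = math.sqrt(K_x_bar), math.sqrt(K_z_bar)
    a, b = k_x * Omega(l, k), Omega(l / k_z, k)
    return (a - b) / (a + b)

def T(R):
    """Transmission coefficient (220)."""
    return 1.0 + R

if __name__ == "__main__":
    k, K = 1.0, 10.9                      # sample values only
    for l in (0.0, 0.3, 0.6):
        print(l, R_e(l, k, K), T(R_e(l, k, K)), R_h(l, k, K, K), T(R_h(l, k, K, K)))
```

Under the assumed $\Omega$, both coefficients at $l = 0$ reduce to the familiar normal-incidence values $\pm(1 - k_n)/(1 + k_n)$ and vanish when the dielectric constants equal 1.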

We illustrate some features of the calculation of $g_j(\nu)$ and $h_j(\nu)$ with the symmetric and asymmetric step models. We first note that if these models are used to study an electro-optic diode modulator, typical values of the parameters defining the dielectric tensors in (1) through (7)
are’ n = 3,31, A210, 5, &2 X 10* (n = 2, y, 2), Ai = 0.96A, 




A, = 1.04 A. Then M, — m, & 1.4 X 10°’, m, & 10.9. Condition 
(212) is thus well satisfied. 

For the symmetric step model we let $\bar{K}_n = K_0$ $(n = x, y, z)$. If the functions $\mathcal{E}_y^{(i)}(l)$ and $\mathcal{H}_y^{(i)}(l)$ are sharply peaked about $l = 0$, then (218) and (219) can be further approximated by
al 8; (De ** dl = T.O)e,"(a, 0), (221) 
h(a, 0) = TO) hy” (x, 0). (222) 


The calculation of g,(v) and h;(v) is now reduced to quadratures. If 
the incident field is not sharply peaked, we define 


e,(x, 0) = T.(0) 


e(l,) =f - ela, eo" de, (228) 
¥ilt,0) = 5 [ Yulo, )Mle(a)\ ede, (224) 
so that 
o@ = [ : PDE ()%,(1, ») dl, “(225) 
h,v) = / : Tay? (DY,(1, ») dl, 7 = 1,2. (226) 


If $\nu$ is in the continuous spectrum, $\Phi_j(l, \nu)$ and $\Psi_j(l, \nu)$ are distributions which are easily determined with the aid of the relation"?

$$\int_{0}^{\infty} e^{-icx}\, dx = 1/(ic) + \pi\delta(c), \tag{227}$$

where $\delta(c)$ is the delta function and when $1/c$ appears under an integral sign, it is assumed that the Cauchy principal value is taken. If $\nu$ is in the point spectrum, $\Phi_j(l, \nu)$ and $\Psi_j(l, \nu)$ are ordinary functions.

For the asymmetric step model we let $\bar{K}_n = \tfrac{1}{2}(K_1 + K_2)$ $(n = x, y, z)$. For this model, a straightforward application of (28) and (52) fails in general if $\nu$ is in the point spectrum or if $\nu \in I_3$, because $\phi_j(x, \nu)$ and $\psi_j(x, \nu)$ now grow exponentially as $x$ tends to either plus infinity or minus infinity. This apparent difficulty is merely a reflection of the manner of convergence of the integrals defining $g_j(\nu)$ and $h_j(\nu)$. For our purposes here, it is enough to note from (169) and (209) that when $\nu$ is in the point spectrum, the functions $g_j(\nu)$ and $h_j(\nu)$ do not appear independently, but only in the linear combinations

$$\sum_{k=1}^{2} \{\delta\rho_{11}(\nu_j)\}^{-1/2}\,\delta\rho_{1k}(\nu_j)\, g_k(\nu_j) = \int_{-\infty}^{\infty} e_y(x, 0)\,\Phi(x, \nu_j)\, dx, \qquad j = 1, 2, \cdots, R, \tag{228}$$
$$\sum_{k=1}^{2} \{\delta\sigma_{11}(\nu_j)\}^{-1/2}\,\delta\sigma_{1k}(\nu_j)\, h_k(\nu_j) = \int_{-\infty}^{\infty} h_y(x, 0)\,\Psi(x, \nu_j)\, dx, \qquad j = 1, 2, \cdots, S. \tag{229}$$

The integrals on the right of (228) and (229) are now well defined. Similarly, if $\nu \in I_3$, the relevant quantities to calculate are

$$\sum_{j=1}^{2} r_j(\nu)\, g_j(\nu) = \int_{-\infty}^{\infty} e_y(x, 0)\,\Lambda(x, \nu)\, dx, \tag{230}$$
$$\sum_{j=1}^{2} s_j(\nu)\, h_j(\nu) = \int_{-\infty}^{\infty} h_y(x, 0)\,\Xi(x, \nu)\, dx. \tag{231}$$

If $\nu \in I_4$, (28) and (52) can be applied directly. Now, all the techniques discussed in the case of the symmetric model can be applied here.

V. SUMMARY 


In Section I we have defined a class of dielectric waveguide models. The waveguide is formed by an anisotropic, nonuniform dielectric filling the half space $z > 0$. The dielectric tensor is diagonal in the fixed coordinate system of Fig. 1, and the diagonal matrix elements are functions of $x$ only, $K_n(x)$ $(n = x, y, z)$.

Integral representations for the incident, reflected, and transmitted fields were given in (15), (18), and (27), respectively, for the TE fields, and in (39), (41), and (51), respectively, for the TM fields. These representations are very general, holding for a large class of functions $K_n(x)$ and incident fields. These integral representations, however, contain the unknown functions $\phi_j(x, \nu)$, $\psi_j(x, \nu)$, $\rho_{jk}(\nu)$ and $\sigma_{jk}(\nu)$ $(j, k = 1, 2)$, which are determined solely by the dielectric tensor, $K_n(x)$, and the unknown functions $g_k(\nu)$, $h_k(\nu)$ $(k = 1, 2)$, $\mathcal{E}_y^{(r)}(l)$, and $\mathcal{H}_y^{(r)}(l)$, which also depend on the incident field and the boundary conditions at $z = 0$. It was shown that this latter group of unknown functions are the solutions of two sets of integral equations, (30)-(31) for the TE fields and (54)-(55) for the TM fields. These equations are very complicated, and we have been unable to solve them exactly for any specific models of interest.

In Section III we gave a detailed calculation of the functions ¢;(z, v), 
W;(x, v), pj(v), and o;.(v) G, & = 1, 2) for both the symmetric and 
asymmetric step models. These calculations are important in their own 
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right, since the symmetric and asymmetric step models have been 
used extensively in the study of the electro-optic diode modulators.*~° 
However, these computations also illustrate the technique for treating 
the whole class of piecewise constant models. This is important, for 
it is not yet completely established which is the correct model to use 
in exploring the behavior of the electro-optic diode modulator, and 
it is felt that any actual physical situation can be well approximated 
by a piecewise constant model. 

It should be noted that the success of the techniques used in this 
paper depends on being able to obtain exact analytic solutions of (21) 
and (48), or at least good analytic approximations to these solutions. 
There are a number of other models for which the exact solutions of (21) 
can be obtained, for example the continuous dielectric constant models 
described in Section III of NM. It is, however, much more difficult 
to find models, other than the piecewise constant models, for which 
(43) is solvable in terms of known functions. Nevertheless, the 
possibility remains of investigating the TE fields for a fairly wide variety 
of models. 

The calculations of Section III provide a method of determining 
the discrete modes which is different from the methods used in earlier 
treatments.”’*"® These calculations showed also that the asymmetry 
of the background light is accentuated in the asymmetric step model 
by total internal reflection at the junction region boundary. 

Finally, in Section IV it was shown that good approximations can 
be found for the functions g_k(v), h_k(v), 3S (1), and 8)” (1) in certain 
cases of physical interest. In particular, these approximations are valid 
for the electro-optic diode modulator. These approximations do not 
depend on a particular choice of the incident field. 

The final results of this paper, then, are integral representations of 
the fields for both the TE and TM cases. Of the various functions in 
the integrands, some have been determined exactly and good approxi- 
mations have been found for the remainder for a number of important 
models and for arbitrary incident fields. 

These integral representations are complicated in appearance, but 
when z is large enough, asymptotic expansions of them can be found 
which lend themselves to numerical analysis. A subsequent paper will 
present asymptotic expansions of the transmitted fields for the 
symmetric and asymmetric step models in the case that the incident 
field is Gaussian, together with numerical results for cases of 
experimental interest. 
Demagnetizing Fields in Thin 
Magnetic Films 


By D. B. DOVE 
(Manuscript received January 31, 1967) 


Demagnetizing fields play an important role in the operation of many 
thin magnetic film devices. A requirement of high packing density leads to 
strong localization of induced changes in magnetization; and, therefore, to 
correspondingly large demagnetizing fields and drive currents. A treatment of 
the demagnetizing field problem for thin film materials is given here for 
film properties and fields which are nonuniform along the hard anisotropy 
axis. Specifically considered are saturating fields, variations in film thick- 
ness and anisotropy constant, interaction between films, and the effect of 
easy direction bias fields. 


I. INTRODUCTION 


The behavior of the magnetization in thin magnetic films of large 
lateral extent subject to a uniform applied field may be calculated 
directly from a knowledge of film properties and field strength. The 
calculation of the behavior of magnetization in the presence of non- 
uniformity of film properties or of applied field, however, must take 
into account the demagnetizing field that arises from a local non- 
uniformity of magnetization. Such a situation occurs in many problems 
of practical interest. Internally generated fields give rise to a number 
of effects when nonuniform fields are applied to thin uniaxially 
anisotropic films.¹,² For example, the hard axis field required for 
saturation may be several times the anisotropy field and the induced 
magnetization component may spread to regions where the applied field 
is very small. The occurrence of such effects in thin films has been 
considered by Rosenberg³ using a calculus of variations approach and 
by Kump and Greene⁴ and Kump⁵ using an iterative numerical 
procedure. More recently Dove and Long⁶ have shown that there is a 
simple solution to the nonuniform field problem in the case of 
nonsaturating spatially periodic applied fields, and have treated localized 


THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967 


fields by using a Fourier series technique. Good agreement was found 
with Kerr-effect probe measurements on flat and cylindrical permalloy 
films. 

The purpose of the present work is to show how the Fourier series 
technique permits straightforward solution of a number of thin film 
magnetostatic problems. Flat and cylindrical film geometries are 
treated; however, the results are of special interest to the case of 
cylindrical films with axial hard direction, owing to the circumferential 
flux closure. Specifically, we consider the cases of: 

(i) nonuniform hard axis field, 
(ii) nonuniform saturating field, 
(iii) variation in film thickness, 
(iv) variation in anisotropy constant, 
(v) external fields due to magnetization distribution in film, flux 
linkage with conductors, magnetic shielding, 
(vi) interaction between parallel films, keepers, and 
(vii) nonuniform hard axis field in presence of easy direction bias 
field. 


It is assumed that the quantities of interest vary along the film hard 
axis only and that properties and fields are uniform along the easy 
axis. Film thickness is taken to be sufficiently small that the direction 
of magnetization always lies in the plane of the film, exchange forces 
are neglected, being insignificant for cases considered, and anisotropy 
dispersion effects are not included. 


II. GENERAL CONSIDERATIONS 


We consider demagnetizing field effects that arise in thin uniaxially 
anisotropic films when relevant parameters vary only along the hard 
anisotropy axis. Many applications fall within this category and will 
be treated in following sections. Many of the results may be applied 
to thin films of other types of magnetic materials in the range where 
they exhibit a constant permeability, if the effective anisotropy field 
is taken to be equal to the saturation magnetization divided by the 
permeability. 

Although the demagnetizing field may be found if the magnetization 
distribution is known, and conversely a knowledge of the field enables 
the distribution to be found, there is considerably greater difficulty in 
determining both distribution and field directly. In the thin film case, 
the Fourier series technique provides a means of representing the field 
distribution for which the demagnetizing field can be found quite 
generally. The rotation of magnetization within a film may then be 
found by balancing, for example, (for nonsaturating fields) anisotropy 
torque versus the torque due to applied field and demagnetizing field. 
This leads to equations relating the coefficients of the various series 
which in a practical application may be most conveniently evaluated 
by computer. 

The number of terms included in the series determines the resolu- 
tion with which a particular curve may be delineated. However, a 
series with, say, 100 terms may be made to fit ordinates at 100 loca- 
tions exactly, with oscillations about the required curve elsewhere. 
The procedure followed here is to use the series to calculate ordinates 
at the 100 locations, and a smooth curve is then drawn through the 
calculated ordinates. Refs. 7 and 8 have been found of value for the 
evaluation of integrals occurring in the following sections. 

Numerical examples, where given, refer to nonmagnetostrictive 80/20 
NiFe films. The films are finely polycrystalline and are characterized 
by a uniaxial anisotropy. The easy direction is taken to be 
circumferential in the cylindrical film case. 


III. NONUNIFORM HARD AXIS FIELD 

This case has been discussed previously⁶ but is included here briefly 
for completeness. Let x represent distance along the film hard 
direction, M the value of saturation magnetization, T the film thickness, 
K the anisotropy constant, and θ(x) the angle which the direction of 
magnetization (at x) makes with the film easy anisotropy direction. 
We now assume that the applied field H(x) may be adequately 
represented over a range −λ/2 to +λ/2 by the series 

$$H(x) = \sum_{n=-\infty}^{\infty} h_n \exp(2\pi i n x/\lambda) \tag{1}$$

and that the resulting hard direction component of magnetization 
M(x) may be similarly represented, 

$$M(x) = M \sum_{n=-\infty}^{\infty} m_n \exp(2\pi i n x/\lambda). \tag{2}$$

The distribution M(x) gives rise to a local (positive) pole density at 
location (X, Y) of amount −div M(X, Y). This gives rise to a field 
dH at (x, y), distance R from (X, Y), given by 

$$dH(x, y) = -\operatorname{div} M(X, Y)\,\frac{dX\,dY\,T}{R^{2}},$$
Fig. 1—A divergence of magnetization at (X, Y) gives rise to a field dH at 
(x, y). The x direction is taken to coincide with the film hard (anisotropy) 
direction. Under no applied field the direction of magnetization lies along the y, or 
easy, direction. 


where dH is parallel to R, as in Fig. 1. Since the only variation of 
magnetization is along the x direction, variation with thickness being 
neglected, div M reduces to dM(X)/dX, where M(X) is the x direction 
component of M at X. 

The field dH has both easy and hard direction components; however, 
symmetry ensures that the resultant field H_D(x), obtained by integrating 
over the film volume, lies along the hard direction. Then we find, for a 
flat film, 

$$H_D(x) = -T \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{dM(X)}{dX}\,\frac{(x - X)}{R^{3}}\,dX\,dY, \tag{3}$$

where T is the film thickness. Substituting R = [(x − X)² + (y − Y)²]^(1/2) 
and integrating over Y we have 

$$H_D(x) = -2T \int_{-\infty}^{\infty} \frac{dM(X)}{dX}\,\frac{1}{x - X}\,dX.$$

Now substituting for M(X) in terms of the Fourier series, we have 

$$H_D(x) = +2TM \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} m_n \left(\frac{2\pi i n}{\lambda}\right) \frac{\exp(2\pi i n X/\lambda)}{X - x}\,dX$$

and evaluating the integral, with H_D(x) counted from here on as positive 
when it opposes the applied field [the sense in which it enters the torque 
balance (5)], 

$$H_D(x) = \sum_{n=-\infty}^{\infty} a_n m_n \exp(2\pi i n x/\lambda), \tag{4}$$

where a_n = 4π²TMn/λ, n > 0, and a_{−n} = a_n. A similar result holds for 
cylindrical films having a circumferential easy direction, where x now 
refers to distance along the cylinder axis. In this case, we find 

$$a_n = 4\pi M (T/a)(2\pi n a/\lambda)^{2}\, I_0(2\pi n a/\lambda)\, K_0(2\pi n a/\lambda),$$

where a is the cylinder radius and I_0, K_0 are modified Bessel functions. 

The local rotation θ(x) of magnetization away from the easy direction 
due to the applied field is determined by balancing the torque due 
to the applied field against the torques due to anisotropy and the 
demagnetizing field, 

$$2K \sin\theta(x)\cos\theta(x) + M H_D(x)\cos\theta(x) = M H(x)\cos\theta(x), \quad \text{all } x. \tag{5}$$

We note that sin θ(x) = M(x)/M, and provided cos θ(x) ≠ 0, we 
may rewrite (5) as 

$$\frac{2K}{M}\,\frac{M(x)}{M} + H_D(x) = H(x). \tag{6}$$

If the field is sufficiently large that θ(x) becomes equal to π/2 then the 
film is said to have saturated (at x) and the torque equation (5) is 
replaced by M(x) = M. In the nonsaturating case the series 
representations (1), (2), (4) are now substituted in (6) giving 

$$H_K \sum_n m_n \exp(2\pi i n x/\lambda) + \sum_n a_n m_n \exp(2\pi i n x/\lambda) = \sum_n h_n \exp(2\pi i n x/\lambda),$$

where H_K = 2K/M. Equating coefficients of corresponding terms gives 
the result, 

$$m_n = h_n/(H_K + a_n).$$

Hence, the series for the M(x) distribution may be obtained in terms 
of the coefficients of the applied field and geometrical parameters a_n 
which automatically take into account the demagnetizing field, 

$$M(x) = M \sum_n \frac{h_n}{H_K + a_n} \exp(2\pi i n x/\lambda). \tag{7}$$
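As a numerical illustration of (7) for a flat film, the following minimal sketch divides each applied-field coefficient by H_K + a_n and sums the series. The function names and the sample values (M = 800 G, T = 1 μm, λ = 1 mm, H_K = 3 Oe) are assumptions chosen only for illustration; they are not taken from the paper.

```python
import cmath
import math


def flat_film_a(n, thickness, m_sat, wavelength):
    """a_n = 4*pi^2*T*M*|n|/lambda for a flat film, with a_(-n) = a_n."""
    return 4.0 * math.pi ** 2 * thickness * m_sat * abs(n) / wavelength


def magnetization_coeffs(h, h_k, thickness, m_sat, wavelength):
    """Return {n: m_n} with m_n = h_n / (H_K + a_n); valid below saturation."""
    return {
        n: h_n / (h_k + flat_film_a(n, thickness, m_sat, wavelength))
        for n, h_n in h.items()
    }


def m_over_m_sat(x, m, wavelength):
    """Real part of sum_n m_n exp(2*pi*i*n*x/lambda), i.e. M(x)/M."""
    total = sum(
        m_n * cmath.exp(2j * math.pi * n * x / wavelength)
        for n, m_n in m.items()
    )
    return complex(total).real


# Assumed sample values: lengths in cm, M in gauss, fields in oersteds.
H_K, T, M_SAT, LAM = 3.0, 1.0e-4, 800.0, 0.1
h = {-1: 0.5, 0: 1.5, 1: 0.5}  # H(x) = 1.5 + cos(2*pi*x/lambda)
m = magnetization_coeffs(h, H_K, T, M_SAT, LAM)
```

With these assumed numbers a_1 is roughly ten times H_K, so the n = ±1 harmonics of the magnetization are strongly suppressed relative to the uniform term, which is the smoothing effect of the demagnetizing field described in the text.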


As an example, we consider a wire at distance d from a flat film, 
lying parallel to the film easy direction. A current I along the wire 
produces a hard direction field component given by H(x) = 
CdI/(d² + x²), where the origin for x is taken directly beneath the wire, 
and C is a calibration constant whose value depends on the units used 
(C = 78.8 for d and x in mils, I in amperes, H in oersteds). It is 
next assumed that the field is repeated at intervals λ along the hard 
direction in such a way that the field over one wavelength is given by 

$$H(x) = \frac{CdI}{d^{2} + x^{2}}, \qquad -\frac{\lambda}{2} \le x \le \frac{\lambda}{2}.$$

To determine the Fourier coefficients we proceed in the usual way, and 
find that for λ sufficiently large H(x) is given to a good approximation 
by the cosine series, 

$$H(x) = \frac{CI\pi}{\lambda} + \frac{2CI\pi}{\lambda} \sum_{n=1}^{\infty} e^{-2\pi n d/\lambda} \cos(2\pi n x/\lambda).$$

Substituting into (7) we have 

$$M(x) = \frac{CIM\pi}{H_K \lambda} + \frac{2CIM\pi}{\lambda} \sum_{n=1}^{\infty} \frac{e^{-2\pi n d/\lambda}}{H_K + a_n} \cos(2\pi n x/\lambda). \tag{8a}$$

If such a drive wire arrangement is used to apply a field to a 
cylindrical film, there is some variation in axial field strength across 
the cylinder. In many cases of interest, the cylinder diameter is 
small compared with axial dimensions and there is very tight 
magnetostatic coupling around the circumference. We therefore take the 
effective axial field as that applied along the wire axis, a reasonable 
approximation for many cases. The result (8a) then applies to the 
cylindrical film case provided a_n is given the appropriate value. 

When a field is applied by a circular loop of radius d around the 
film (of radius a), it may be shown that the axial field at the surface 
is given by the series, for λ sufficiently large, 

$$H(x) = \frac{CI\pi}{\lambda} + \frac{2CI\pi}{\lambda} \sum_{n=1}^{\infty} \left(\frac{2\pi n d}{\lambda}\right) K_1\!\left(\frac{2\pi n d}{\lambda}\right) I_0\!\left(\frac{2\pi n a}{\lambda}\right) \cos(2\pi n x/\lambda),$$

where K_1, I_0 are modified Bessel functions. The field is defined over 
−λ/2 to +λ/2 and d > a. The axial component of magnetization in 
a cylinder excited by such a field is then, 

$$M(x) = \frac{CIM\pi}{H_K \lambda} + \frac{2CIM\pi}{\lambda} \sum_{n=1}^{\infty} \frac{(2\pi n d/\lambda)\, K_1(2\pi n d/\lambda)\, I_0(2\pi n a/\lambda)}{H_K + a_n} \cos(2\pi n x/\lambda). \tag{8b}$$

Similar results may be derived for fields applied by more complicated 
drive wire or drive strap arrangements. It can be noted that the effect 
of superimposing several applied fields results simply in superimposing 
the magnetization distributions obtained for the fields separately. 
Hence, one approach to designing a magnetization distribution of a 
required shape is to approximate the shape by superimposing a set of 
known distributions. Many distributions of practical interest may be 
described by a cosine series and discussion in the following sections is, 
for clarity, limited to the cosine rather than the full series. Results for 
the full series may be readily derived, if required. 
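The single-wire cosine series and the result (8a) can be evaluated directly. The sketch below is illustrative only: the function names, the truncation at 2000 terms, and the film parameters passed to the assumed flat-film helper `flat_a` are hypothetical choices, while C = 78.8 is the constant quoted in the text for mils, amperes, and oersteds.

```python
import math

C_MIL = 78.8  # calibration constant quoted in the text (mils, amperes, oersteds)


def wire_field(x, current, d, lam, n_terms=2000):
    """Cosine-series H(x) for a wire at distance d, repeated with period lam."""
    h0 = C_MIL * current * math.pi / lam
    k = 2.0 * math.pi / lam
    series = sum(
        math.exp(-n * k * d) * math.cos(n * k * x)
        for n in range(1, n_terms + 1)
    )
    return h0 + 2.0 * h0 * series


def wire_magnetization(x, current, d, lam, h_k, a_n, n_terms=2000):
    """M(x)/M from (8a); a_n is a caller-supplied function n -> a_n (oersteds)."""
    h0 = C_MIL * current * math.pi / lam
    k = 2.0 * math.pi / lam
    series = sum(
        math.exp(-n * k * d) / (h_k + a_n(n)) * math.cos(n * k * x)
        for n in range(1, n_terms + 1)
    )
    return h0 / h_k + 2.0 * h0 * series


def flat_a(n, thickness, m_sat, lam):
    """Assumed flat-film a_n = 4*pi^2*T*M*n/lambda (T and lambda in the same units)."""
    return 4.0 * math.pi ** 2 * thickness * m_sat * n / lam
```

For example, `wire_magnetization(0.0, 1.0, 5.0, 1000.0, 3.0, lambda n: flat_a(n, 0.04, 800.0, 1000.0))` gives the relative hard-direction magnetization directly beneath a wire carrying 1 A at 5 mils, under the assumed film parameters. Because (7) is linear in the h_n, distributions for several wires may be obtained by adding the separate results.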

Fig. 2(a) to (f) shows the relative fall off in applied field H(x) 
and in axial magnetization component M(x) for a range of drive 
strap geometries. The plots are for a 1 μm thick cylindrical permalloy 
film of 5.0 mil diameter. Curves a, b, c, d correspond to drive strap 
half widths of 1.0, 5.0, 10.0, 20.0 mils, respectively. In Fig. 2(a), 
(b) the distance between drive strap (or return strap) and film axis 
is 3.5 mils. Fig. 2(c), (d) and (e), (f) correspond, respectively, to 
distances of 5.0 and 10.0 mils. It can be noted that the magnetization 
distributions extend to a considerable distance and do not vary as 
strongly as the applied field. The fields of Fig. 2(a), (c), (e) are 
shown to normalized scale; however, the peak field or drive current 
required to just saturate the axial component at x = 0 varies 
significantly with geometry, and is shown in Fig. 3. 

In a plated wire memory, the local state of a region of film may be 
assigned as positive or negative depending on the remanent circum- 
ferential component of magnetization. To read out the circumferen- 
tial component in a nondestructive manner, a local axial field is 
applied by a drive strap surrounding the wire at the location of in- 
terest, and the signal appearing across the ends of the plated wire 
is measured. The signal is due to the circumferential flux change 
integrated along the wire (neglecting capacitive or other emfs). The 
circumferential component distribution is obtained simply from the 
axial component using the relation M(circumferential) = [M² − 
M(axial)²]^(1/2). The total area under this curve is proportional to the signal 
obtained when the circumferential component has been set completely 
into one direction. It is convenient to equate the integrated circum- 
ferential component to an equivalent length of film that has every- 
where a 90° rotation of magnetization. Fig. 4 shows the equivalent 
lengths of film for the curves of Fig. 2. 
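The equivalent length can be estimated numerically from a sampled axial distribution. The sketch below rests on an assumption suggested by the caption of Fig. 4: the quantity integrated is the change in the circumferential component, 1 − [1 − (M(axial)/M)²]^(1/2), so that a fully rotated (90°) element contributes its whole length. The function names, the trapezoidal rule, and the optional `reversed_width` argument (which counts a centered reversed region negatively) are illustrative choices.

```python
import math


def circumferential_fraction(m_axial):
    """M(circumferential)/M = sqrt(1 - (M(axial)/M)^2), clipped at saturation."""
    return math.sqrt(max(0.0, 1.0 - m_axial * m_axial))


def equivalent_length(xs, m_axial, reversed_width=0.0):
    """Trapezoidal integral of the change in circumferential component.

    xs: increasing positions (mils); m_axial: M(axial)/M at those positions.
    Points with |x| < reversed_width/2 are counted with negative sign.
    """
    def integrand(x, m):
        change = 1.0 - circumferential_fraction(m)
        return -change if abs(x) < reversed_width / 2.0 else change

    total = 0.0
    for i in range(1, len(xs)):
        f0 = integrand(xs[i - 1], m_axial[i - 1])
        f1 = integrand(xs[i], m_axial[i])
        total += 0.5 * (f0 + f1) * (xs[i] - xs[i - 1])
    return total
```

A film saturated axially over its whole sampled length returns that length, and an unexcited film returns zero, which matches the definition of equivalent length given above.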

If now a locally reversed region is established and the readout field 
applied again, the signal will have decreased, since the reversed region 
contributes to the signal with reversed sign. It has been found 
previously⁶ that the presence of a domain wall has little effect on the 
macroscopic magnetization distribution; hence, the curves of Fig. 2 may be 
used to estimate the new signal. In this case, the area under the 
circumferential plot is taken negatively over the length of the reversed 
region and positively for the remainder. Fig. 5 shows curves of net 
equivalent length versus width of reversed region. Curves a and b 
correspond to strap half width of 1.0 mil but half separations of 3.5 
and 5.0 mils, respectively. Curves c and d correspond to strap half 
width of 10.0 mils, and half separations of 5.0 and 10.0 mils, 
respectively. 


IV. NONUNIFORM FIELDS LARGE ENOUGH TO PRODUCE LOCAL SATURATION 


When the local effective field reaches the value H_K, the local 
magnetization rotation has the value π/2; hence, M(x) = M, the 
Fig. 2— The curves denoted a, b, c, d refer, respectively, to a parallel drive strap 
arrangement of half widths 1.0, 5.0, 10.0, and 20.0 mils. (a) and (b) correspond 
to a strap-to-film axis distance of 3.5 mils, (c) and (d) correspond to 5.0 mils and 
(e) and (f) to 10.0 mils. (a), (c), and (e) give to normalized scale the field 
H(x)/H(0) applied along the axis of a 5.0 mil diameter, 1 μm thick cylindrical 
permalloy film with H_K = 3.0. (b), (d), and (f) show the resulting axial 
magnetization components M(x)/M due to the actual (i.e., non-normalized) 
applied field. 


saturation value. A further increase in the field cannot, therefore, 
produce any further increase in M(x), and it is necessary to modify the 
preceding discussion to take the effect of saturation into account. 

We assume that the magnetization distribution is monotonic, and 
the width of the saturated region is specified at the outset. The 
current required to produce this degree of saturation may then be found 
for a given drive strap geometry, and the resulting magnetization 
distribution is calculated. This somewhat arbitrary procedure renders 
the problem tractable. 

If the film has saturated over a region −R ≤ x ≤ R then the 
material within this region has M(x) = M, a constant; hence, 
dM(x)/dx vanishes within this region. It is convenient to introduce a 
modifying function S(x), having period λ, that is zero over the range 
−R ≤ x ≤ R, but is otherwise unity. The product S(x) dM(x)/dx 
then has the property of being zero over −R ≤ x ≤ R but is otherwise 
equal to dM(x)/dx. By introducing this product into the integral for 
the demagnetizing field in place of dM(x)/dx, we have effectively 
modified the integral without changing the limits of integration. Let 
Fig. 3— Current in drive strap required to just saturate the film of Fig. 2 at 
x = 0, for the several drive strap geometries of Fig. 2. 


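H(x) and M(x) be represented by the finite series 

$$H(x) = \sum_{n=0}^{N} h_n \cos(2\pi n x/\lambda), \qquad M(x) = M \sum_{n=0}^{N} m_n \cos(2\pi n x/\lambda),$$

also let S(x) be represented by a cosine series, then 

$$S(x) = \sum_{n=0}^{\infty} s_n \cos(2\pi n x/\lambda),$$

where for the required step function 

$$s_0 = 1 - (2R/\lambda), \qquad s_n = -\frac{4R}{\lambda}\left(\frac{\sin(2\pi n R/\lambda)}{2\pi n R/\lambda}\right), \quad n > 0.$$

Differentiating the series for M(x), we have 

$$\frac{dM(x)}{dx} = -\frac{2\pi M}{\lambda} \sum_{n=0}^{N} n\, m_n \sin(2\pi n x/\lambda).$$

Then the product may be written, 

$$S(x)\frac{dM(x)}{dx} = -\frac{2\pi M}{\lambda} \sum_{j=0}^{\infty} \sum_{n=0}^{N} s_j\, n\, m_n \cos(2\pi j x/\lambda) \sin(2\pi n x/\lambda)$$

$$= -\frac{\pi M}{\lambda} \sum_{j=0}^{\infty} \sum_{n=0}^{N} s_j\, n\, m_n \left[\sin 2\pi (j + n)x/\lambda - \sin 2\pi (j - n)x/\lambda\right].$$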
This represents a series of the form A_0 + A_1 sin 2πx/λ + ... and we 
may rearrange by grouping the coefficients to obtain 

$$S(x)\frac{dM(x)}{dx} = -\frac{\pi M}{\lambda} \sum_{n=1}^{N} \left[\sum_{p=1}^{N} \left(s_{|n-p|} - s_{n+p} + s_0 \delta_n^p\right) p\, m_p\right] \sin(2\pi n x/\lambda),$$

where δ_n^p = 1 when p = n, but is otherwise zero, and the series for S(x) 
is terminated for subscripts greater than 2N. Using this final series in 
place of the series for dM(x)/dx in the integral (3) for the demagnetizing 
field we obtain, 

$$H_D(x) = \sum_{n=1}^{N} \frac{a_n}{2n} \sum_{p=1}^{N} \left(s_{|n-p|} - s_{n+p} + s_0 \delta_n^p\right) p\, m_p \cos(2\pi n x/\lambda), \tag{9}$$

where the a_n have the values calculated previously for the 
nonsaturating case. There are now several conditions that the magnetization 
distribution must satisfy: it has the value M(x) = M over the range 
−R ≤ x ≤ R and satisfies the torque equation (6) outside this range, 
and finally, the amplitude of the applied field is such that M(x) 
determined from (6) has also the value M at x = ±R. The required 
field value is given by the calculation for any particular drive strap 

[Fig. 4 plot: equivalent length of film, mils, versus half width of strap, mils; curves labeled by strap-to-film axis distance.] 


Fig. 4— The change in circumferential component of magnetization averaged 
along the film is proportional to the signal obtained during readout. This is ex- 
pressed in terms of equivalent length of film that would produce the same signal 
when uniformly excited to saturation. The plots are derived from the axial com- 
ponent distributions of Fig. 2. 
configuration. We now substitute the series (1), (2), and (9) into 
the torque equation (6) and, gathering coefficients, we obtain 

$$H_K m_0 = h_0 \quad \text{for } n = 0$$

and the set of N equations, 

$$H_K m_n + \sum_{p=1}^{N} \frac{p\, a_n}{2n} \left(s_{|n-p|} - s_{n+p}\right) m_p + \frac{s_0}{2} a_n m_n = h_n, \qquad n = 1, 2, \cdots, N. \tag{10}$$

These N equations constitute a set of linear simultaneous equations in 
the N unknown coefficients m_n. These equations may be expressed, 

$$\sum_{p=1}^{N} C_{np} m_p = h_n, \qquad n = 1, 2, \cdots, N,$$





[Fig. 5 plot: net equivalent length, mils, versus width of reversed region, mils.] 



Fig. 5 — Change in net equivalent length of film (proportional to output signal 
during NDRO), versus width of reversed domain established beneath drive strap. 
Curves a, b refer to strap half width of 1.0 mils, and strap to film axis distances of 
3.5 and 5.0 mils, respectively. Curves c and d refer to strap half width of 10.0 mils 
and strap to film axis distances of 5.0 and 10.0 mils, respectively. The curves are 
derived from the axial distributions of Fig. 2. 
[Fig. 6(a) plot: abscissa x, mils along wire.] 


Fig. 6— (a) Theoretical curve and experimental points taken with the Kerr 
effect probe⁶ for a saturated cylindrical film. The broken curve shows the relative 
fall off of the axial applied field. (b) The field is applied by a parallel drive wire 
arrangement shown in cross section. The current I applied in the drive wires is 
1.14 A. 


where the C_np are given by 

$$C_{np} = \frac{p\, a_n}{2n} \left(s_{|n-p|} - s_{n+p}\right) + \left(\frac{s_0}{2} a_n + H_K\right) \delta_n^p.$$



Such a set of equations may be conveniently inverted by computer 
for any particular case giving the m_n coefficients in terms of the h_n's. 
Since the m_n and h_n coefficients are linearly related, a scale factor, e.g., 
current in drive strap, is applied to H(x) to ensure that the 
distribution has a value M at x = ±R. The resulting series indicates a 
nonuniform distribution for M(x) within the range −R ≤ x ≤ R, but, by 
the action of S(x), this produces no demagnetizing field and therefore 
does not influence the distribution obtained outside the range. The 
value of M(x) is therefore set equal to M inside the saturation range. 
The plot obtained within this range reflects instead the value of 
(H − H_D)/H_K. 
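The computer inversion mentioned above can be sketched as follows. This is only an illustration of the linear system (10) under stated assumptions: the function names are hypothetical, the solver is plain Gaussian elimination with partial pivoting, and the final rescaling of the drive current so that M(±R) = M is left out.

```python
import math


def step_coeffs(r_sat, lam, count):
    """s_0 .. s_(count-1) for S(x): zero on |x| <= R, unity elsewhere in a period."""
    s = [1.0 - 2.0 * r_sat / lam]
    for n in range(1, count):
        arg = 2.0 * math.pi * n * r_sat / lam
        sinc = math.sin(arg) / arg if arg != 0.0 else 1.0
        s.append(-(4.0 * r_sat / lam) * sinc)
    return s


def saturation_matrix(a, h_k, r_sat, lam):
    """Matrix C[n-1][p-1], n, p = 1..N, of eq. (10); a is the list a_0 .. a_N."""
    n_max = len(a) - 1
    s = step_coeffs(r_sat, lam, 2 * n_max + 1)
    c = [[0.0] * n_max for _ in range(n_max)]
    for n in range(1, n_max + 1):
        for p in range(1, n_max + 1):
            c[n - 1][p - 1] = p * a[n] / (2.0 * n) * (s[abs(n - p)] - s[n + p])
        c[n - 1][n - 1] += h_k + 0.5 * s[0] * a[n]
    return c


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        acc = aug[r][size] - sum(aug[r][k] * x[k] for k in range(r + 1, size))
        x[r] = acc / aug[r][r]
    return x


def solve_saturated(h, a, h_k, r_sat, lam):
    """Return [m_0, ..., m_N] given h = [h_0, ..., h_N] and a = [a_0, ..., a_N]."""
    m_rest = solve_linear(saturation_matrix(a, h_k, r_sat, lam), h[1:])
    return [h[0] / h_k] + m_rest
```

Setting the saturated half width R to zero makes every s_n with n > 0 vanish and s_0 = 1, so the matrix becomes diagonal and the solution reduces to the nonsaturating result m_n = h_n/(H_K + a_n); this is a convenient check on any implementation.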

Fig. 6(a) shows a plot of the axial magnetization distribution where 
the film has saturated over a length of 30 mils, for a cylindrical film 
of 5.2 mil diameter, 0.69 μm thickness and H_K = 3.1 Oe. The broken 
curve of Fig. 6(a) shows a normalized plot of the applied field. The 
field is applied by a drive wire, and the separation between drive and 
return wire is 20 mils as shown in Fig. 6(b). The calculation indicates a 
current of 1.14 A to produce this degree of saturation. The points 
represent measurements made previously⁶ using the Kerr effect probe. 

Fig. 7— Axial component of magnetization for the cylindrical film of Fig. 6 
when driven to different degrees of saturation. 

Fig. 7 shows the axial magnetization component for the geometry 
of Fig. 6 where the film has saturated to widths of 0, 10, 20, 30, 40 mils. 
The applied field is shown in Fig. 8, curve a, versus width of saturated 
region produced by the field. Curve b is for a drive strap of half width 
10 mils and strap to film axis distance of 10 mils. The shape of the 
curve does not appear to vary markedly with drive strap geometry. 
It can be noted that little increase in current is required to extend the 
saturated region from 1 to 10 mils, but that saturation to greater 
widths requires increasingly larger currents. 


V. FILM THICKNESS VARIATION 


Now let T(x) be the variable film thickness and assume that T(x) 
and H(x) have the same periodic distance λ; then we may write 

$$T(x) = \sum_{n=0}^{N} t_n \cos(2\pi n x/\lambda).$$

In the thin film approximation, magnetization variations within the 
thickness of the film are neglected and demagnetizing fields are 
calculated from the net pole density per unit area of film. To take into 
account a variation in thickness we take the product T(x)M(x) as the 
total magnetization component in the hard direction and evidently the 
pole density is then given by −(d/dx)[T(x)M(x)]. 

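Taking the product of the series, we obtain 

$$T(x)M(x) = \frac{M}{2} \left\{ t_0 m_0 + \sum_{p=0}^{N} t_p m_p + \sum_{n=1}^{N} \left[\sum_{p=0}^{N} m_p \left(t_{n+p} + t_{|n-p|} + t_0 \delta_n^p\right)\right] \cos(2\pi n x/\lambda) \right\},$$

hence, replacing M(x) by T(x)M(x) in (3), the demagnetizing field 
is given by 

$$H_D(x) = \sum_{n=0}^{N} \frac{a_n}{2} \sum_{p=0}^{N} m_p \left(t_{n+p} + t_{|n-p|} + t_0 \delta_n^p\right) \cos(2\pi n x/\lambda), \tag{11}$$
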
Fig. 8—Current required to produce a given width of saturated region along 
a cylindrical film of radius 2.6 mils, thickness 0.69 μm, H_K = 3.1 Oe. Curve a is for 
the arrangement of Fig. 6. Curve b is for a parallel conductor drive strap of width 
20 mils situated at ±10 mils from the film axis. 
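where δ_n^p = 1 when n = p but is otherwise zero. [For (11) to reduce to (4) 
when the thickness is uniform, the a_n in (11) must be understood per unit 
film thickness, that is, without the factor T.] Substituting into the 
torque equation (6), and equating coefficients we have finally 

$$H_K m_0 = h_0$$

and 

$$\sum_{p=0}^{N} \left[\frac{a_n}{2}\left(t_{n+p} + t_{|n-p|} + t_0 \delta_n^p\right) + H_K \delta_n^p\right] m_p = h_n, \qquad n = 1, 2, \cdots, N,$$

i.e., 

$$\sum_{p=1}^{N} \left[\frac{a_n}{2}\left(t_{n+p} + t_{|n-p|} + t_0 \delta_n^p\right) + H_K \delta_n^p\right] m_p = h_n - a_n t_n h_0/H_K. \tag{12}$$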
p=1 


This last expression represents a set of linear simultaneous equations 
which may be solved numerically to give the coefficients m_n in terms 
of t_n and h_n. The calculation, when applied to the case of a flat film 
strip having an ellipsoidal cross section along the hard direction, 
subject to a uniform field, predicts a uniform demagnetizing field of 
magnitude very close to that indicated by the tables of Osborne⁹ based 
on the solution of Maxwell’s equation for the general ellipsoid. Fig. 9 
shows the magnetization distribution near an edge of a uniform 
thickness (0.22 μm) flat film with H_K = 2.62 Oe. The points represent data 
taken with the Kerr effect probe. 


[Fig. 9 plot: M(x) versus x, mils from edge of film; H_K = 2.62 Oe, T = 0.22 μm, uniform field.] 

Fig. 9— Magnetization component near the edge of a flat film of thickness 
0.22 μm, and H_K = 2.62 Oe. The applied field is uniform and equal to H_K. The 
edge runs parallel to the film easy direction. The points show measurements 
taken with the Kerr effect probe. 
Fig. 10— Axial magnetization component for cylindrical film segments of differing 
length due to the field from a parallel wire drive strap at distance ±7.5 mils from 
film axis. Curves a, b, and c refer to segments of length 40, 80, 160 mils, respectively. 
Curve d refers to a continuous film. The current in the drive wire is 0.5 A. (b) shows a 
cross section of the drive wire arrangement. 


Fig. 10 shows, for comparison, the magnetization distribution for a 
nonsaturating hard direction field applied to 5.2 mil diameter 
cylindrical film segments of differing lengths, but uniform thickness of 0.7 μm, 
and H_K = 3.0. The field is applied by a parallel drive wire arrangement 
of separation 15 mils. Finally, Fig. 11 shows the axial magnetization 
distribution for a uniform field applied to a cylindrical film having a 
circumferential cut. Film radius is 2.6 mils, thickness is 1.0 μm and H_K 
= 3.0 Oe. It is to be noted that the present technique has a spatial 
resolution limited both by the number of terms of the series that can 
be retained for computation, and by the basic limitation that exchange 
forces are neglected. We cannot, therefore, expect to obtain detail of 
magnetic behavior very close to an edge, for example, or for an ex- 
tremely narrow scratch. 


VI. ANISOTROPY MAGNITUDE VARIATION 


Let us assume that the anisotropy constant is represented by a 
cosine series, i.e., 

$$K(x) = \sum_{n=0}^{N} k_n \cos(2\pi n x/\lambda).$$
Fig. 11— Plot of axial magnetization component for a 5.2 mil diameter 
cylindrical film with a circumferential gap. The curves show the result for a 4 mil, 6 mil 
and wide gap. The axial applied field is uniform and equal to 3.0 Oe. Film thickness 
is 1.0 μm and H_K = 3.0 Oe. 


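Then substituting into the torque equation (6), and gathering terms 
we find 

$$\sum_{p=0}^{N} \frac{1}{M} \left(k_p + k_0 \delta_0^p\right) m_p = h_0 \tag{13}$$

and 

$$\sum_{p=0}^{N} \left[\frac{1}{M}\left(k_{n+p} + k_{|n-p|} + k_0 \delta_n^p\right) + a_n \delta_n^p\right] m_p = h_n, \qquad n = 1, 2, \cdots, N. \tag{14}$$
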


Together these equations represent N + 1 linear simultaneous equations 
in N + 1 unknown coefficients m_n, and may be solved by computer. 
This calculation may be used, for example, to find the local behavior of M 
at the junction between two regions with differing anisotropy constants, 
or to find the effective permeability of a film having some systematic 
variation in anisotropy constant. A simplified discussion of this latter 
problem has been given previously.¹⁰ Fig. 12 shows the effect of using a 
high H_K buffer region surrounding a normal H_K section of film. Curve a 
shows the distribution for a uniform wire with H_K = 3.0, b shows the 
modification when H_K is increased to a value H_K = 15 for all distances 
beyond x = 10 mils and c shows the result when H_K is further increased 
to 30 Oe in the buffer region. The effectiveness of the high H_K buffer 
region in sharpening the distribution can be noted. This is achieved, 
however, at the expense of greater current required to just saturate at 
x = 0. For curves a, b, c the currents are 0.50, 0.79, and 0.93 A, 
respectively. Fig. 12(b) shows a cross section of the parallel conductor 
drive strap arrangement. 


VII. FIELD EXTERNAL TO FILM 


Combs and Wujek¹¹ have calculated the field external to a thin film 
rectangular slab assuming a pole distribution concentrated at the 
edges of the slab. We now calculate the field external to a continuous 
film subject to various applied field conditions where the details of the 
effective pole distribution form the essential part of the problem. The 
results of previous sections may be adapted to find the field external 
to films which have a hard axis variation in thickness or anisotropy 


[Fig. 12 plot: M(x)/M for H_K = 3.0 Oe, T = 1.0 μm; curves a, b, c; panel (b) shows the drive strap cross section.] 
Fig. 12 — (a) Effect of high H_K buffer region surrounding a normal H_K section of 
cylindrical film. Curve a shows the magnetization component for a uniform 
film with H_K = 3.0. Curves b and c show the result when H_K is increased to 15 and 
30 Oe, respectively, for distances greater than 10 mils to either side of the drive 
strap centerline. (b) Details of drive strap arrangement. The currents required to 
just saturate the film at x = 0 are 0.5, 0.79, and 0.93 A for the cases a, b, and c, 
respectively. 
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but these cases are not considered in detail here. Consider the field at 
some distance d from the surface of a flat film and at distance x 
along the hard axis. The external field H,, parallel to the film due to 
the distribution of poles over the film surface may be found by evalu- 
ating the integral 


vine) (@@ — X)dX dYT 
H(t, d) = i: i Eas aaa (15) 
Substituting for M(x) and performing the integration, we find 

$$H_{D,e}(x, d) = -\frac{4\pi^2 MT}{\lambda}\sum_{n=1} n\,m_n\,e^{-2\pi n d/\lambda}\cos(2\pi n x/\lambda). \tag{16a}$$
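
As a rough illustration of (16a), the sketch below sums the series for a given set of coefficients m_n. The function name `external_field` and its argument names are hypothetical, and Gaussian units are assumed (M in emu/cm³, lengths in cm, field in Oe); none of this is taken from the paper.

```python
import math

def external_field(m, x, d, lam, M, T):
    """Evaluate the series (16a) for a flat film (illustrative sketch).

    m   : list of cosine coefficients m_n of M(x)/M; m[0] does not contribute
    x   : position along the hard axis
    d   : distance from the film surface
    lam : period of the cosine expansion
    M   : saturation magnetization, T : film thickness
    """
    prefactor = -4.0 * math.pi ** 2 * M * T / lam
    total = 0.0
    for n in range(1, len(m)):
        k = 2.0 * math.pi * n / lam          # spatial frequency of harmonic n
        total += n * m[n] * math.exp(-k * d) * math.cos(k * x)
    return prefactor * total
```

Each harmonic decays as exp(-2πnd/λ), so fine structure in M(x) is lost first as the observation point moves away from the film.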


This is the external field parallel to the plane of the film given as a 
function of distance d from the film. For a cylindrical film the result is 


$$H_{D,e}(x, d) = -4\pi aTM\sum_{n=1}\left(\frac{2\pi n}{\lambda}\right)^2 K_0(2\pi n d/\lambda)\,I_0(2\pi n a/\lambda)\,m_n\cos(2\pi n x/\lambda), \tag{16b}$$
where a is the cylinder radius, and d is the distance from cylinder axis 
to the location at which the axial component of field is measured, 
(d > a). The field inside the cylinder may be similarly derived, the 
result is 


$$H_{D,e}(x, d) = -4\pi aTM\sum_{n=1}\left(\frac{2\pi n}{\lambda}\right)^2 K_0(2\pi n a/\lambda)\,I_0(2\pi n d/\lambda)\,m_n\cos(2\pi n x/\lambda),$$

where now d < a. Along the cylinder axis I_0(0) = 1. Fig. 13 shows a 
plot of the axial component of the demagnetizing field for several values 
of distance from film axis. The cylindrical film is assumed to have a 
diameter of 5.2 mils, H_K = 3.0 Oe, and a thickness of 1.0 µm, and is excited by 
a one turn loop of radius 7.5 mils. 

The flux coupling a parallel wire loop parallel to a flat film surface 
and to the film easy direction, with the conductors at ±D from the 
surface, may now be found. The flux F per unit length of the parallel 
conductor loop is then 

$$F = 4\pi M(x)T + 2\int_0^D H_{D,e}(x, z)\,dz.$$

Substituting for H_{D,e} and rearranging, we find 

$$F = 4\pi MT\sum_{n=0}^{N} m_n\,e^{-2\pi n D/\lambda}\cos(2\pi n x/\lambda). \tag{17}$$

Fig. 13 — External axial component of field due to the distribution of 
magnetization along a cylindrical film. (The field due to the drive strap is not 
included.) The field is plotted along lines parallel to the film axis, at several 
distances from the axis. The film has a thickness of 1 µm, H_K = 3.0 Oe, diameter 
5.2 mils, and is subject to the field from a one turn circular loop of diameter 15 
mils. Curves a, b, and c refer to distances of 2.6, 5.0, and 10 mils from the axis, 
respectively. 


If the magnetization distribution is due to the field from a parallel 
wire loop with conductors at ±d from the film surface, then using 
expression (8a), we have 

$$\frac{F(x)}{4\pi T} = \frac{\pi CIM}{\lambda H_K} + \frac{2\pi CIM}{\lambda}\sum_{n=1}^{N}\frac{e^{-2\pi n(D+d)/\lambda}\cos(2\pi n x/\lambda)}{H_K + \alpha_n}. \tag{18a}$$

It can be noted that F(x)/4πT is formally equivalent to the 
magnetization component in the film at the plane of the loop due to a current 
I in a loop with conductors at ±(D + d) from the film. The mutual 
inductance between two loops (not necessarily enclosing the film) may 
then be found directly from the above results. 

The flux linkage between the film and drive loop is obtained by 
setting x = 0 and D = d. A current I in the loop gives rise to a 
magnetization component M(0, I, d) at x = 0, and the flux linking the loop is 
given by M(0, I, 2d), using (18a). The fractional flux linkage is 
therefore M(0, I, 2d)/M(0, I, d). 
At x = 0, the expression (18a) may be evaluated in closed form; the 
result is, 
FO) _ 21C 
4nT = an MT 


Hence the fractional flux linkage (FFL) is 

$$\mathrm{FFL} = \exp(\mu d)\,E_i(-2\mu d)/E_i(-\mu d),$$

where μ = 2H_K/4πMT and E_i is the exponential integral. This is a 
useful parameter which shows the degree of coupling between loop and 
film, and is plotted in Fig. 14 as a function of d, for a flat film of 
thickness 0.1 µm, H_K = 4.0 Oe. 
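
The FFL expression can be evaluated numerically with a short sketch such as the one below. It uses the identity E_i(-x) = -E_1(x), so the signs cancel in the ratio. The helper names (`exp_integral_e1`, `fractional_flux_linkage`, `mu_per_cm`) are hypothetical, and the value 4πM = 10⁴ G used in the usage note is an assumed round figure for permalloy, not a value given in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, eps=1e-12, max_terms=200):
    """E1(x) = -Ei(-x) for x > 0 (series for x <= 1, continued fraction otherwise)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # E1(x) = -gamma - ln x - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, max_terms):
            term *= -x / k                  # term = (-x)^k / k!
            total -= term / k
            if abs(term / k) < eps * abs(total):
                break
        return total
    # Modified Lentz evaluation of the continued fraction for E1
    b = x + 1.0
    c = 1.0e300
    d = 1.0 / b
    h = d
    for i in range(1, max_terms):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)

def fractional_flux_linkage(mu_d):
    """FFL = exp(mu d) Ei(-2 mu d) / Ei(-mu d), written in terms of E1."""
    return math.exp(mu_d) * exp_integral_e1(2.0 * mu_d) / exp_integral_e1(mu_d)

def mu_per_cm(h_k_oe, four_pi_m_gauss, thickness_cm):
    """mu = 2 H_K / (4 pi M T), in cm^-1 when T is in cm."""
    return 2.0 * h_k_oe / (four_pi_m_gauss * thickness_cm)
```

For example, with H_K = 4.0 Oe, T = 0.1 µm (1e-5 cm), and the assumed 4πM = 10⁴ G, `mu_per_cm` gives about 80 cm⁻¹, so a spacing of 5 mils (0.0127 cm) corresponds to μd of roughly 1. In this form the FFL falls from near unity at small μd toward one half at large μd.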
The result for cylindrical films is more complicated. In this case it 
can be shown that 


F() _ 2CIrM 
4nrT nN 
3 ¢ 
: a2) x (222 (2222) x, (208) 1(2%22) cog een 
r d d r nN r 


MT (2rna\*_, (2 2 
no 0 es) ae) 





exp (2dH x/4arMT)E (— 2dH x«/4rMT) ' 











a 


Fig. 14 — Fractional flux linkage between a flat film of thickness 0.1 µm, H_K 
= 4.0 Oe, and a pair of parallel wire conductors as a function of distance from 
film to the conductors. The parallel wire conductors serve as both drive and 
sense windings. 

where the cylinder has radius a, thickness T, and is excited by the field 
from a circular loop of radius d. F(x) gives the amount of flux picked 
up by a loop of radius D at an axial distance x from the drive loop. 


VIII. INTERACTION BETWEEN PARALLEL FILMS 


Consider two plane parallel films (denoted 1 and 2) of thickness T 
and T′ and anisotropy fields H_K and H_K′, respectively, separated by a 
distance w along a normal to the film surfaces. A nonuniform field is 
applied along the (parallel) hard directions by a drive strap. Let the hard 
direction fields be H(x) and H′(x). The field acting on film 1 due to the 
distribution within film 2 we denote by H_12(x), and similarly the field 
acting on 2 due to film 1 is H_21(x). These fields are taken to act along the 
films' common hard direction, and the films are assumed to be sufficiently 
thin that fields normal to the surface have negligible effect. 

The torque equation determining the local rotation of magnetization 
within the two films may be written 

$$H_K\sin\theta(x) = H(x) + H_D(x) + H_{12}(x), \quad \text{film 1} \tag{19}$$
$$H_K'\sin\theta'(x) = H'(x) + H_D'(x) + H_{21}(x), \quad \text{film 2.} \tag{20}$$

Let M(x), M′(x) be the hard direction components of magnetization 
in the two films; then from previous sections we have (assuming cosine 
distributions) 

$$H(x) = \sum h_n\cos(2\pi n x/\lambda), \qquad H'(x) = \sum h_n'\cos(2\pi n x/\lambda)$$
$$H_D(x) = -\beta T\sum n\,m_n\cos(2\pi n x/\lambda), \qquad H_D'(x) = -\beta T'\sum n\,m_n'\cos(2\pi n x/\lambda)$$
$$H_{12}(x) = -\beta T'\sum n\,m_n'\exp(-2\pi n w/\lambda)\cos(2\pi n x/\lambda)$$
$$H_{21}(x) = -\beta T\sum n\,m_n\exp(-2\pi n w/\lambda)\cos(2\pi n x/\lambda),$$

where β = 4π²M/λ. Noting that sin θ(x) = M(x)/M and sin θ′(x) = 
M′(x)/M, we substitute the above series into the two torque equations 
and, equating coefficients, obtain 

$$H_K m_n = h_n - \beta nT\,m_n - \beta nT'\,m_n'\exp(-2\pi n w/\lambda)$$
$$H_K' m_n' = h_n' - \beta nT'\,m_n' - \beta nT\,m_n\exp(-2\pi n w/\lambda).$$

Solving for m_n and m_n′, we have finally 

$$m_n = \left[h_n - \frac{\beta nT'\,h_n'\exp(-2\pi n w/\lambda)}{H_K' + \beta nT'}\right]\left[H_K + \beta nT - \frac{\beta^2 n^2 TT'\exp(-4\pi n w/\lambda)}{H_K' + \beta nT'}\right]^{-1} \tag{21}$$

$$m_n' = \left[h_n' - \frac{\beta nT\,h_n\exp(-2\pi n w/\lambda)}{H_K + \beta nT}\right]\left[H_K' + \beta nT' - \frac{\beta^2 n^2 TT'\exp(-4\pi n w/\lambda)}{H_K + \beta nT}\right]^{-1}. \tag{22}$$

These expressions can be compared with the results when the films are 
present singly, i.e., at large separations, 

$$m_n = h_n\,(H_K + \beta nT)^{-1}$$
$$m_n' = h_n'\,(H_K' + \beta nT')^{-1}.$$


Evidently the calculation can be extended to a greater number of layers 
and it is immaterial whether the drive fields are applied positively or 
negatively provided the fields are appropriately assigned, that is, the 
field may be generated by conductors located between or completely 
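to one side of the films. The equations relating the coefficients m_n, m_n′ 
may be concisely expressed in matrix form, 

$$\begin{bmatrix} H_K & 0 \\ 0 & H_K' \end{bmatrix}\begin{bmatrix} m_n \\ m_n' \end{bmatrix} + \beta n\begin{bmatrix} T & 0 \\ 0 & T' \end{bmatrix}\begin{bmatrix} m_n \\ m_n' \end{bmatrix} + \beta n\,e^{-2\pi n w/\lambda}\begin{bmatrix} 0 & T' \\ T & 0 \end{bmatrix}\begin{bmatrix} m_n \\ m_n' \end{bmatrix} = \begin{bmatrix} h_n \\ h_n' \end{bmatrix}. \tag{23}$$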
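
The two-film system is a 2×2 linear problem for each harmonic n, so it can be checked numerically. The sketch below solves the coupled coefficient equations by Cramer's rule and compares the result with the closed form (21). The function names and argument order are hypothetical; any consistent set of units can be used.

```python
import math

def coupled_pair(h, h_p, n, H_K, H_Kp, T, T_p, beta, w, lam):
    """Solve the coupled equations for (m_n, m_n') of two parallel films."""
    e = math.exp(-2.0 * math.pi * n * w / lam)   # interaction attenuation
    a11 = H_K + beta * n * T                     # anisotropy + self demagnetization, film 1
    a22 = H_Kp + beta * n * T_p                  # same for film 2
    a12 = beta * n * T_p * e                     # field on film 1 from film 2
    a21 = beta * n * T * e                       # field on film 2 from film 1
    det = a11 * a22 - a12 * a21
    m = (h * a22 - a12 * h_p) / det
    m_p = (a11 * h_p - a21 * h) / det
    return m, m_p

def closed_form_m(h, h_p, n, H_K, H_Kp, T, T_p, beta, w, lam):
    """m_n from the closed-form expression (21)."""
    e = math.exp(-2.0 * math.pi * n * w / lam)
    numer = h - beta * n * T_p * h_p * e / (H_Kp + beta * n * T_p)
    denom = H_K + beta * n * T - (beta * n) ** 2 * T * T_p * e * e / (H_Kp + beta * n * T_p)
    return numer / denom
```

With h_n′ = -h_n and identical films, the solution reduces to m_n = h_n/(H_K + βnT(1 - exp(-2πnw/λ))), which is larger than the single-film value. This is the partial cancellation of demagnetizing fields described in the text for a drive wire placed between the films.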


The three matrix terms of the left-hand side represent in turn the effect 
of anisotropy, demagnetizing field, and interaction between films. The 
extension to three or more films is straightforward. Fig. 15 shows the 
effect of flux closure between two films only 2 mils apart subjected to the 
field from a drive wire sandwiched between them. The films have equal 
thickness of 0.1 µm and anisotropy field H_K = 4.0 Oe. Since the fields are 
applied in opposite directions in the two films, the demagnetizing fields 
tend to cancel and the magnetization distribution widths are smaller 
than for similar films well spread apart. Curve a shows the coupled 
distribution, and b shows the distribution with one film removed. The 
current required to just saturate the films is 0.127 A; with one film 
removed the current required rises to 0.170 A. With films of thickness 
1000 Å, separations of the order of a few mils are essential for this effect to be 
appreciable. 

Fig. 15 — (a) Effect of flux closure between two identical flat films, separated by 
a distance of 2 mils. The field is applied by a single wire placed between the 
films as shown in (b). The films have a thickness of 0.1 µm and H_K = 4.0 Oe. 
Curve b shows the result when one of the films is removed. The current required 
to just saturate the films at x = 0 now rises from the bifilm value 0.127 A to 0.170 
A for a single film. 

We may use the results (21) and (22) to examine the effect of a keeper 
layer. The action of the keeper is to modify the field applied to the film 
and to provide some degree of flux closure. Consider the case of a flat 
film situated between two drive wires, distance d from the film, with a 
keeper layer distance w > d from the film. Let primed quantities refer 
to the keeper, and unprimed refer to the film. The keeper typically has a 
thickness of mils or tens of mils and hence 4π²MT′/λ >> H_K′ for 
reasonable values of M and λ. Equation (21) then reduces to, 


$$m_n = \frac{h_n - h_n'\exp(-2\pi n w/\lambda)}{H_K + \beta nT\,(1 - \exp(-4\pi n w/\lambda))}. \tag{24}$$


The field applied to the film in the absence of the keeper is H(x) = Σ h_n 
cos 2πnx/λ, where for the present case 

$$h_0 = \frac{2CI\pi}{\lambda}, \qquad h_n = \frac{4CI\pi}{\lambda}\exp(-2\pi n d/\lambda).$$

I is the current in the drive wires. The field applied to the keeper is 
given by Σ h_n′ cos 2πnx/λ where h_0′ = 0, 

$$h_n' = \frac{2CI\pi}{\lambda}\,\{\exp(-2\pi n(w + d)/\lambda) - \exp(-2\pi n(w - d)/\lambda)\}.$$

Then, m_0 = 2CIπ/λH_K, and 

$$m_n = CI\,(2\pi/\lambda)\,\frac{2\exp(-2\pi n d/\lambda) - \exp(-2\pi n(2w + d)/\lambda) + \exp(-2\pi n(2w - d)/\lambda)}{H_K + \beta nT\,(1 - \exp(-4\pi n w/\lambda))}. \tag{25}$$

It can be noted that the terms in the numerator are equivalent to the 
coefficients of the field due to the drive strap directly, and to images 
of the drive straps, with the keeper as mirror. The image property of 
the keeper layer is well known and has had considerable application 
to the discussion of keepers; see, for example, Refs. 12 and 13. The 
effect of the mutual interaction between keeper and film is to modify 
the α_n factors (α_n = βnT for a flat film) by a term 1 − exp(−4πnw/λ). 
The influence of this term is twofold: (1) the spreading of the 
magnetization component is reduced and (2) the drive field required is 
reduced. 
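
The keeper result can be checked numerically. The sketch below evaluates (25) for one harmonic and compares it with (24) built from the drive coefficients h_n and h_n′ quoted above. The function names are hypothetical, and C is treated simply as the constant appearing in the text, passed in as a parameter.

```python
import math

def drive_coefficients(n, C, I, lam, d, w):
    """Return (h_n, h_n') for n >= 1, as given in the text."""
    k = 2.0 * math.pi * n / lam
    h = 4.0 * C * I * math.pi / lam * math.exp(-k * d)
    h_keeper = 2.0 * C * I * math.pi / lam * (
        math.exp(-k * (w + d)) - math.exp(-k * (w - d)))
    return h, h_keeper

def m_with_keeper(n, C, I, lam, d, w, H_K, beta, T):
    """m_n for n >= 1 from (25): drive strap plus its images in the keeper."""
    k = 2.0 * math.pi * n / lam
    numer = C * I * (2.0 * math.pi / lam) * (
        2.0 * math.exp(-k * d)
        - math.exp(-k * (2.0 * w + d))
        + math.exp(-k * (2.0 * w - d)))
    denom = H_K + beta * n * T * (1.0 - math.exp(-2.0 * k * w))
    return numer / denom

def m_without_keeper(n, C, I, lam, d, H_K, beta, T):
    """Single-film value h_n / (H_K + beta n T)."""
    k = 2.0 * math.pi * n / lam
    h = 4.0 * C * I * math.pi / lam * math.exp(-k * d)
    return h / (H_K + beta * n * T)
```

For w > d > 0 the keeper both enlarges the numerator (image terms) and shrinks the denominator (partial flux closure), so each m_n is larger for the same drive current than without the keeper.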

Fig. 16 shows the effect of a keeper layer on the distribution in a 
flat film of thickness 0.2 µm, H_K = 4.0 Oe. Field is supplied by a pair 
of drive straps of width 10 mils carrying a current of 0.22 A, at a 
distance of 5 mils from the film. The keeper layer is taken to be 6 mils 
from the film. Curve a shows the hard direction component in the 
absence of the keeper, b shows the effect only of the image fields 
due to the presence of the keeper, and c shows the final result when 
image fields and partial flux closure are taken into account. 

Fig. 16 — (a) Effect of a keeper layer on the magnetization distribution in a flat 
film of thickness 0.2 µm, H_K = 4.0 Oe. Field is applied by parallel drive straps of width 
10 mils at ±5 mils from the film. The keeper layer is taken to be 6 mils from the 
film as shown in (b). Curve a shows the hard direction component in the absence 
of the keeper, Curve b shows the effect of the image fields only when the keeper 
is present, and Curve c shows the final result when image fields and flux closure 
are taken into account. 

The effects of a flat keeper layer on the response of a cylindrical 
film are not amenable to calculation by the present method owing to 
the mixed geometry. 

The case of a cylindrical film with a concentric cylindrical keeper 
is next considered. The discussion closely parallels that for flat films 
and leads to a result analogous to (24), 

$$m_n = \left[h_n - h_n'\,\frac{I_0(2\pi n a/\lambda)}{I_0(2\pi n A/\lambda)}\right]\left[H_K + \alpha_n\left(1 - \frac{I_0(2\pi n a/\lambda)\,K_0(2\pi n A/\lambda)}{I_0(2\pi n A/\lambda)\,K_0(2\pi n a/\lambda)}\right)\right]^{-1}, \tag{26}$$

where for cylindrical geometry α_n = 4πM(T/a)(2πna/λ)² I_0(2πna/λ) 
K_0(2πna/λ). The field is applied by a loop (of radius d) around the 
cylindrical film (of radius a), and h_n, h_n′ are the Fourier coefficients of 
the field at the surface of the film and at the keeper (radius A), 
respectively. The axial field from a circular loop of radius d, at distance a from 
the axis and x from the plane of the loop, is given by’*’” 

$$H(x, a) = CI\left\{K(k) + \frac{d^2 - a^2 - x^2}{(d - a)^2 + x^2}\,E(k)\right\}\left[(a + d)^2 + x^2\right]^{-1/2},$$

where K and E are complete elliptic integrals of the first and second 
kinds, respectively, and k² = 4da/[(a + d)² + x²]. 

It can be noted that the effect of the keeper is to modify the applied 
field and to reduce the demagnetizing field. Fig. 17 shows a practical 
approximation to such a keeper geometry. Fig. 18 shows a plot of the axial 
magnetization component in a 1 µm thick permalloy film with H_K = 
3.0 Oe plated on a 5.2 mil diameter wire, subject to the field from a one 
turn circular loop of radius 7.5 mils carrying a current of 0.3 A. 
The keeper radius is taken to be 10 mils. Curve a shows the 
distribution with no keeper present, b shows the effect of field modification 
alone when a keeper cylinder of diameter 20 mils is in place, and c 
shows the final result when field modification and flux return are taken 
into account. 

Fig. 17 — A possible practical approximation to a cylindrical keeper geometry. 
(Labeled parts: keeper layer, drive strap, cylindrical film.) 

Fig. 18 — Effect of a cylindrical keeper layer on the axial magnetization 
distribution in a cylindrical film of thickness 1.0 µm, H_K = 3.0 Oe, diameter 5.2 mils. 
Field is applied by a one turn loop of radius 7.5 mils. Keeper radius is taken to 
be 10 mils. Curve a shows the distribution with no keeper present, curve b shows 
the effect of the keeper in modifying the applied field, and curve c shows the final 
result when field modification and flux closure are taken into account. 


IX. NONUNIFORM HARD DIRECTION FIELD IN PRESENCE OF EASY 
DIRECTION BIAS FIELD 


In this case the torque equation has to be modified to include the 
easy direction field H_E(x); then 

$$2K\sin\theta(x)\cos\theta(x) = M\,(H(x) - H_D(x))\cos\theta(x) - M H_E(x)\sin\theta(x). \tag{27}$$

Provided cos θ ≠ 0, we may write 

$$H_K\sin\theta(x) = H(x) - H_D(x) - H_E(x)\tan\theta(x), \tag{28}$$

where H_K = 2K/M and it is assumed that H_E is parallel to the easy 
direction component of magnetization. It is convenient to represent 
H_E(x) tan θ(x) by a series 

$$H_E(x)\tan\theta(x) = \sum_{n=0}^{N} d_n\cos(2\pi n x/\lambda).$$

Substituting into the torque equation (28), and gathering coefficients, 
we have 

$$(H_K + \alpha_n)\,m_n = h_n - d_n; \qquad n = 0, 1, 2, \ldots, N.$$

The coefficients d_n are now complicated functions of the m_n's and this 
equation cannot be solved directly. Instead we use an iterative 
procedure as follows: H(x) is given a peak value insufficient to produce 
saturation in the case H_E = 0 and then successive approximations are 
found for the m_n coefficients. In the first approximation we take 

$$m_n = \frac{h_n}{H_K + \alpha_n}.$$

tan θ(x) may now be found from sin θ(x) = M(x)/M, and the Fourier 
coefficients d_n of the product H_E(x) tan θ(x) may be obtained. In the 
next approximation, we take 

$$m_n = \frac{h_n - d_n}{H_K + \alpha_n}.$$

We now find, as before, new coefficients d_n and hence new coefficients m_n, 
until the m_n coefficients change by less than, say, 5 percent per iteration. 
The curves for H(x) and M(x) are then plotted. The whole procedure 
may be repeated as necessary. The bias field may be a constant H_E or be 
a step function changing from H_E to −H_E at some location x = R. The 
step function corresponds to the case of a domain wall being present at 
x = R. The use of the step function provides a formal way of treating 
the modification to the torque equation, due to H_E and the easy 
component of M being parallel for x < R, and antiparallel for x > R. 

Fig. 19 — (a) Axial magnetization component for a cylindrical film with 
uniform easy direction bias field of 1.0 Oe. The nonuniform hard direction field 
is applied by the drive strap arrangement shown in (b). In curve a, the bias field 
aids the rotation of magnetization for large x. A reverse domain is assumed to 
have been written into a width of 20 mils; for x < 10 mils, therefore, the bias field 
opposes the rotation of magnetization. Curve b corresponds to zero bias field. 
Curve c corresponds to a reversal of bias field where it is assumed that the 
reversed region has been erased. (Panel annotations: H_K = 3.0 Oe, T = 1.0 µm, 
I = 0.564 A, bias field = 1.0 Oe.) 

It is to be noted that the torque balance becomes unstable for certain 
combinations of applied fields. The critical fields are related by 
$[H(x) - H_D(x)]^{2/3} + H_E^{2/3} = H_K^{2/3}$, where it is assumed that H_E is antiparallel to the 
easy direction component of M. This limitation does not apply when 
H_E and the easy direction component of M are parallel. 
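
The iterative procedure of this section can be sketched as follows, under simplifying assumptions that are not taken from the paper: a uniform bias field H_E parallel to the easy component of M, cosine coefficients obtained by midpoint quadrature over one period, and user-supplied demagnetizing factors α_n. All function and parameter names are hypothetical.

```python
import math

def cosine_coeffs(values, n_max):
    """Cosine-series coefficients of samples taken at midpoints of one period."""
    p = len(values)
    coeffs = []
    for n in range(n_max + 1):
        s = sum(v * math.cos(2.0 * math.pi * n * (j + 0.5) / p)
                for j, v in enumerate(values))
        coeffs.append(s / p if n == 0 else 2.0 * s / p)
    return coeffs

def synthesize(coeffs, p):
    """Evaluate sum_n c_n cos(2 pi n x / lambda) at the p midpoints."""
    return [sum(c * math.cos(2.0 * math.pi * n * (j + 0.5) / p)
                for n, c in enumerate(coeffs))
            for j in range(p)]

def iterate_bias(h, alpha, h_k, h_e, points=128, tol=0.05, max_iter=200):
    """Successive approximations for m_n with a uniform easy-axis bias h_e."""
    n_max = len(h) - 1
    # First approximation: ignore the bias term.
    m = [h[n] / (h_k + alpha[n]) for n in range(n_max + 1)]
    for _ in range(max_iter):
        sin_t = synthesize(m, points)               # sin(theta) = M(x)/M
        if max(abs(s) for s in sin_t) >= 1.0:
            raise ValueError("film saturated; reduce the drive field")
        prod = [h_e * s / math.sqrt(1.0 - s * s) for s in sin_t]  # H_E tan(theta)
        d = cosine_coeffs(prod, n_max)
        new_m = [(h[n] - d[n]) / (h_k + alpha[n]) for n in range(n_max + 1)]
        scale = max(abs(v) for v in new_m) or 1.0
        change = max(abs(a - b) for a, b in zip(new_m, m)) / scale
        m = new_m
        if change < tol:                            # e.g. 5 percent per iteration
            break
    return m
```

For small rotations tan θ ≈ sin θ, so the fixed point approaches m_n ≈ h_n/(H_K + α_n + H_E), which is the raised apparent anisotropy expected when the bias is parallel to the easy component of M. A step-function bias (domain wall at x = R) would replace the constant `h_e` by a position-dependent list.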

Fig. 19 shows a typical axial magnetization distribution for a 
cylindrical film, and corresponds to the procedure of "writing" into a region of 
film. A current in the plated wire produces a uniform easy direction bias 
field of 1.0 Oe and an external drive strap produces a nonuniform hard 
direction field. The greater spread of curve a compared with the 
zero bias field distribution (curve b) is due to the bias field 
lowering the effective anisotropy to H_K − H_E for rotations less than 
about 40°. The attempt to "erase" by reversing the bias field, curve 
c, raises the apparent anisotropy to H_K + H_E over much of the curve, 
and hence the film response is generally reduced. In curve c it is 
assumed that the reversed region has been erased. It will be appreciated 
that the present calculation assumes at the outset that a domain 
wall has some given location. The resulting distribution must then 
be inspected to decide whether the location chosen was appropriate 
or even stable under the applied field. In a practical case, wall location 
is affected by additional factors such as dispersion and creep, and is not 
discussed further here. Experiments on flat films show that the reversed 
region is not totally erased by simple reversal of bias field. Fig. 20(a) is a 
Kerr effect picture showing a reverse domain of width 20 mils, written 
in by a bias field of 1 Oe and a peak drive field of 5.0 Oe (11 mil strap, 
10 mils from film). Fig. 20(b) shows the result of reapplying the fields 
with reversed bias. Fig. 20(c) shows the result of first demagnetizing 
the film into a fine domain structure; the domain established 
is now much wider. In this case, the effect of the bias field changing the 
apparent anisotropy is much reduced, but the film now has an 
appreciable remanent state; hence, significant hard direction local 
demagnetizing fields exist in addition to the field introduced by the effect 
of the external fields. The relevance of such considerations to domain 
wall creep processes, under practical operating conditions, warrants 
further study but is not pursued here. 

Fig. 20 — (a) Kerr effect picture showing reversed domain (light) in a flat 
film written in by an 11 mil drive strap situated 10 mils beneath the film. (b) 
When bias field is reversed, the domain is not completely erased. (c) Width of 
domain written after first demagnetizing film with a large uniform hard axis 
field. (d) Shows the drive strap arrangement to the same scale. 


X. CONCLUSION 


Demagnetizing fields play an important role in the operation of 
many thin magnetic film devices. The requirement of high packing 
density, as in a memory, leads to strong localization of induced changes 
in magnetization, and to correspondingly large demagnetizing fields 
and drive currents. 

In an open flux structure, attempts to confine magnetization changes 
by using segmented films or high anisotropy buffer regions are 
successful only at the expense of a considerable increase in drive field 
requirement. To some extent flux keeper layers may be used to modify 
applied fields and to permit partial flux closure, with, in consequence, 
both a lowering of drive currents and a reduced spread in induced 
magnetization component. 

The method of calculation given here permits a detailed examina- 
tion to be made of the effectiveness of such procedures, and has been 
applied to a variety of thin film demagnetizing field problems. Kerr 
effect probe measurements® are in good agreement with calculation 
although relatively little data is at present available. The results have 
particular applicability to cylindrical film problems, where axial 
variation of field or properties is of primary concern. 


Some Properties and Limitations of 
Electronically Steerable Phased 
Array Antennas* 


By D. VARON and G. I. ZYSMAN 
(Manuscript received April 4, 1967) 


This paper is a treatment of linear and planar phased arrays of current 
sources, whose amplitudes are uniform and scan-invariant. By recognition 
that the radiation impedance of an array element is an analytic function 
of a complex scan variable, a powerful mathematical tool becomes 
available for the investigation of some important properties of the impedance 
as a function of scan. For example, it is proven that in a finite array the 
impedance seen by such a scan-invariant current source cannot be 
perfectly matched over a continuous scanning range using lossless, linear, 
passive and time-invariant elements. This result is extended to the 
infinite-array case by treating the latter as a periodic structure, and assuming 
that the Green's function of the unit cell is analytic with respect to the 
scan variable. The theory includes both linear and planar arrays. Among 
other results it is shown that the element impedance in an infinite array 
must be of a specific mathematical form. It is hoped that by recognizing 
the limitations imposed thereby, useful guidelines will be established for 
achieving optimal match of an array into space. 


I. INTRODUCTION 


The class of antennas widely known as phased arrays includes 
essentially two types of radiators: stationary and steerable ones. The 
first operates at fixed amplitude and fixed relative phase between the 
array elements. Consequently, the antenna characteristics, such as 
radiation pattern, input impedance, and mutual coupling between 
elements, remain unchanged during the entire operational lifetime of the 
antenna. The steerable antenna is characterized by time varying 
excitation. The relative phase between adjacent elements is varied either 
mechanically or electronically to bring about a variation in the 
orientation of the beam. In most instances scanned arrays are large in size 
and may contain several thousand elements. Their illumination has a 
linear phase taper. As a result the antenna characteristics become scan 
dependent. The relationship between scan angle and various 
parameters of interest such as gain, element impedance, and mutual coupling 
between elements has been the subject of intense investigation in 
recent years.1,2 One particular direction has been towards improvement 
of the impedance match over wide scanning ranges.* At present the 
merit of a matching technique can be determined only relative to 
other techniques. To the best of the authors' knowledge an absolute 
mathematical criterion, based on physical realizability requirements, 
has not been formulated. Some investigators*® claim that a perfect 
match of an infinite array for all scan angles (at which the active 
impedance is not infinite, zero or purely reactive) can be achieved by an 
infinite set of interconnecting network elements. However, the proof 
is based on the assumption that the scan-dependent equivalent load 
impedance at the array-space interface remains unchanged after the 
sources have been interconnected by coupling elements. Although this 
assumption has been successfully applied*:* to improve the matching 
capability of an infinite array, it is incorrect to use it in a perfect 
matching scheme. 

* This work was supported by the U. S. Army under contract DA-30-069- 
AMC-333(Y). 

In this paper a new mathematical approach to phased array anal- 
ysis is presented. The model for the analysis is a phased array of ideal 
current sources (electric or magnetic) of scan-invariant uniform am- 
plitude. This model is further discussed in Section II. The analysis 
itself is based on the general laws of antenna theory and on those 
properties which are common to all phased arrays represented by the 
model. 

The first part of the theory is devoted to finite arrays and is treated 
in Section III. The starting point of the theory is a theorem which 
establishes that the radiation impedance of an element in a finite array 
is an analytic function of the scan angle. Further, it is shown that an 
element in a linear or planar phased array cannot be perfectly matched 
over a continuous scanning range by using lossless, linear, passive and 
time-invariant elements. Then it is demonstrated that the directions 
in space of the beams’ maxima are eigenvalues of a Laplacian differ- 
ential operator with periodic boundary conditions which are related to 
the phase taper of the array, and several useful properties of those 
eigenvalues are derived. 

The second part of the theory appears in Section IV and is devoted 
to infinite arrays, which play an important role in the analysis of large 
phased arrays. The investigation is based on a transformation between 
the scan angle and a complex variable s = α + jβ, which can be 
interpreted on 0 < α < 1, β = 0 as the trigonometric sine function of the 
angle between the plane of the array and the direction in which a chosen 
grating lobe propagates. It is subsequently shown that the element 
impedance, as a function of s, is restricted to a specific mathematical 
form. Recognition of the limitations imposed thereby may provide new 
insight into the behavior of such arrays. 


II. PRELIMINARY REMARKS 


The model chosen for the following treatment is a linear or planar 
phased array excited by a set of ideal current generators of uniform 
amplitude and linear phase taper. The description ideal implies that 
the sources have no internal impedance and are invariant under any 
loading. This means that except for the relative phasing between 
contiguous generators the currents are scan independent. Frequently in 
antenna analysis induced currents are replaced by equivalent sources 
by application of the equivalence principle.® Such currents are not part 
of the sources. The induced currents are accounted for automatically 
by fulfillment of the requirement that the tangential component of 
the electric field has to vanish on all conductors. In general, the 
source-current amplitude in each element of the array may be a function of 
scan. However, this dependence is generally unknown and is often 
neglected in theoretical work. The types of excitations commonly used 
are the "free excitation" and "forced excitation".* The first assumes a 
generator with a scan-invariant internal impedance which is capable of 
delivering scan-invariant incident power. In the latter a constant 
terminal voltage or current is maintained. As pointed out by Oliner and 
Malech, free excitation is easier to realize in high-frequency technology 
than forced excitation. The latter, however, is more tractable here. 
The results of this study remain valid for scan-dependent excitation 
as well, provided the current density of the source is a smoothly 
varying function of scan angle and can be analytically continued into a 
complex scan-angle plane. 

* A. A. Oliner and R. G. Malech, Ref. 1, pp. 209-211. 

Under the assumption that the array is excited by a uniform 
amplitude and a linear phase taper, the current density excitation function 
of an M-element linear array (Fig. 1) is given by 
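
$$\mathbf{J}(x, y, z, \psi) = \begin{cases} \mathbf{J}_0(x - ma,\, y,\, z)\,e^{jm\psi}, & ma \le x \le (m + 1)a, \quad m = 0, 1, \ldots, M - 1, \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

and that of an M × N element planar array of rectangular symmetry 
(Fig. 2) is given by 

$$\mathbf{J}(x, y, z, \psi_x, \psi_y) = \begin{cases} \mathbf{J}_0(x - ma,\, y - nb,\, z)\,e^{j(m\psi_x + n\psi_y)}, & ma \le x \le (m + 1)a, \quad nb \le y \le (n + 1)b, \\ & m = 0, 1, \ldots, M - 1, \quad n = 0, 1, \ldots, N - 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

The above currents can be either electric or magnetic, the latter being 
regarded as equivalent to ideal electric voltage sources. 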

Note that the spherical coordinate systems in Figs. 1 and 2 differ 
from those commonly used in phased array analysis. The poles are 
located at endfire instead of broadside and the ranges of colatitude and 
azimuth are such that the upper hemisphere is spanned by 0 ≤ θ ≤ π, 
0 ≤ φ ≤ π. This convention is chosen for reasons of mathematical 
convenience. The results derived in Section III are valid for linear 
as well as planar arrays. The inclusion of both cases in a single 
treatment is facilitated by a generalized notation for the current 
density excitation function. The steering phases mψ and mψ_x + 
nψ_y are replaced by an equivalent "steering coefficient" σ_mn(φ_pq) in 
the plane of scan oriented at azimuth angle φ_pq. The steering coefficient 
is derived by its relationship to the direction of a beam's maximum, 
which is determined for linear arrays by the equation 

$$\psi + 2p\pi = ka\cos\theta_{p0}; \qquad p = 0, \pm 1, \pm 2, \ldots, \pm\infty \tag{3}$$

and for planar arrays by 

$$\psi_x + 2p\pi = ka\cos\theta_{pq}\cos\varphi_{pq}; \qquad p = 0, \pm 1, \pm 2, \ldots, \pm\infty \tag{4a}$$
$$\psi_y + 2q\pi = kb\cos\theta_{pq}\sin\varphi_{pq}; \qquad q = 0, \pm 1, \pm 2, \ldots, \pm\infty, \tag{4b}$$

where k is the wave number in the medium, and θ_pq is as shown in Figs. 
1 and 2. The steering coefficient is then defined by 

$$\sigma_{mn}(\varphi_{pq}) = k\,(ma\cos\varphi_{pq} + nb\sin\varphi_{pq}); \qquad p, q = 0, \pm 1, \pm 2, \ldots \tag{5}$$

Equations (1) and (2) can now be written as 

$$\mathbf{J}(x, y, z, \theta_{pq}) = \begin{cases} \mathbf{J}_0(x - ma,\, y - nb,\, z)\exp(j\sigma_{mn}\cos\theta_{pq}), & ma \le x \le (m + 1)a, \quad nb \le y \le (n + 1)b, \\ & m = 0, 1, \ldots, M - 1, \quad n = 0, 1, \ldots, N - 1, \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

at φ_pq = const. 

Fig. 1 — Linear phased array. 

Fig. 2 — Planar phased array. 
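
As a small worked illustration of (3), the sketch below lists the real beam directions of a linear array for a given steering phase ψ, using the endfire-pole convention of Fig. 1 (θ measured from the array axis). The function name `beam_angles` and its arguments are hypothetical; a beam of order p exists only when |(ψ + 2pπ)/ka| ≤ 1.

```python
import math

def beam_angles(psi, a, wavelength, tol=1e-12):
    """Return {p: theta_p in degrees} for all real solutions of (3)."""
    ka = 2.0 * math.pi * a / wavelength
    p_limit = int(ka / (2.0 * math.pi)) + 2       # generous bound on |p|
    angles = {}
    for p in range(-p_limit, p_limit + 1):
        c = (psi + 2.0 * math.pi * p) / ka        # cos(theta_p)
        if abs(c) <= 1.0 + tol:
            c = max(-1.0, min(1.0, c))            # clamp rounding overshoot
            angles[p] = math.degrees(math.acos(c))
    return angles
```

With half-wavelength spacing and ψ = 0 only the p = 0 beam exists, at θ = 90° (broadside in this convention). With one-wavelength spacing the p = ±1 grating lobes appear at endfire.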
Under the above generalization the excitation function for the 
linear array becomes a special case, q = 0, N = 1, φ_pq = 0, and the 
period in the y-direction extends from −∞ to +∞; or alternatively 
p = 0, M = 1, φ_pq = π/2 and the period in the x-direction extending 
from −∞ to +∞. Since the phase constant exp{jσ_mn(φ_pq) cos θ_pq} 
is independent of (p, q), any θ_pq may be chosen as the independent 
variable of scan. The subscript pq will henceforth be omitted whenever 
the mathematical expressions are independent of (p, q). 

The time dependence $e^{j\omega t}$ is assumed throughout the analysis. In 
a steerable array the phase taper is time dependent. However, it is 
understood that the rate of change of the phase taper is very small 
in comparison to the angular frequency, i.e., dψ/dt ≪ ω, since only 
under that condition do the classical concepts of directivity and 
radiation impedance remain meaningful. If ψ(t) is a step function it is 
assumed that the time interval is long enough to allow all transients 
to reach negligible values before a new step is initiated. 

The formal solution of the array problem is obtained from 
Maxwell's Equations via a vector potential A(x, y, z, θ) which is a solution 
of the inhomogeneous reduced wave equation 

$$\nabla^2\mathbf{A} + k^2\mathbf{A} = -\mu\,\mathbf{J}(x, y, z, \theta), \tag{7}$$

where μ is the permeability of the medium. The magnetic field is given by 

$$\mathbf{H} = \frac{1}{\mu}\nabla\times\mathbf{A} \tag{8a}$$

and the electric field (under Lorentz gauge) by 

$$\mathbf{E} = -j\omega\left(\mathbf{A} + \frac{1}{k^2}\nabla\nabla\cdot\mathbf{A}\right). \tag{8b}$$

The solution to (7) over infinite space V can be written in closed 
form in terms of a dyadic Green's function’ 

$$\mathbf{A}(x, y, z, \theta) = \mu\int_V \mathcal{G}(x, y, z \mid \xi, \eta, \zeta)\cdot\mathbf{J}(\xi, \eta, \zeta, \theta)\,d\xi\,d\eta\,d\zeta, \tag{9}$$

where $\mathcal{G}(x, y, z \mid \xi, \eta, \zeta)$ is a solution of 

$$\frac{\partial^2\mathcal{G}}{\partial x^2} + \frac{\partial^2\mathcal{G}}{\partial y^2} + \frac{\partial^2\mathcal{G}}{\partial z^2} + k^2\mathcal{G} = -\mathbf{I}\,\delta(x - \xi)\,\delta(y - \eta)\,\delta(z - \zeta), \tag{10}$$

I being the unit dyadic a_x a_x + a_y a_y + a_z a_z. The boundary conditions 
which the Green's function has to satisfy are derivable via the Vector Green's Theorem* 
by imposition of the requirement that the tangential component of 
the electric field has to vanish on all conductors. This guarantees that 
all induced currents are accurately determined. 

* P. M. Morse and H. Feshbach, Ref. 7, p. 1767. 

It can be shown’ that the average complex power delivered by the
$mn$th element in the array is

$$P_{mn} = -\frac{1}{2}\int_{V_{mn}} \mathbf{E}\cdot\mathbf{J}^*_{mn}\,dv, \tag{11}$$

where

$$\mathbf{J}_{mn}(x, y, z, \theta) = \mathbf{J}(x, y, z, \theta), \qquad ma \le x \le (m+1)a,\ \ nb \le y \le (n+1)b, \tag{12}$$

the asterisk (*) denotes complex conjugate, and $V_{mn}$ is a simply
connected volume occupied by $\mathbf{J}_{mn}$. If $S_{mn}$ is a surface obtained by taking
a cross section through $V_{mn}$, the total current, $I_{mn}$, flowing through
the cross section $S_{mn}$ is

$$I_{mn} = \iint_{S_{mn}} \mathbf{J}\cdot d\mathbf{s}. \tag{13}$$

The element radiation impedance, $Z_{mn}$, is defined in terms of the
complex power by

$$P_{mn} = \tfrac{1}{2}\,|I_{mn}|^2\,Z_{mn}. \tag{14}$$


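By (10) and (13) via (8b) and (9), the element radiation impedance
can be defined directly in terms of the array geometry and the
excitation:

$$Z_{mn}(\theta) = \frac{1}{|I_{mn}|^2}\int\!\!\int \mathbf{J}^*_{mn}(x, y, z, \theta)\cdot\mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta)\cdot\mathbf{J}(\xi, \eta, \zeta, \theta)\,d\tau\,dv, \tag{15a}$$

where $d\tau = d\xi\,d\eta\,d\zeta$, $dv = dx\,dy\,dz$, and

$$\mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta) = j\omega\mu\left(\mathbf{I} + \frac{1}{k^2}\nabla\nabla\right)\mathcal{G}(x, y, z\,|\,\xi, \eta, \zeta). \tag{15b}$$

Operator $\nabla$ operates on $(x, y, z)$. The quantity $|I_{mn}|^2$ is introduced for
the purpose of normalization, and may depend on the choice of the
cross section $S_{mn}$.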

The definition of the impedance includes both linear and planar
array elements. It is consistent with the commonly known definition of
impedance? if the latter is viewed as a relation between the average
complex power delivered by the generator and the rms current flowing
into the load. The definition given by (15) is necessary in view of the
fact that in a system excited by distributed currents, a terminal
voltage in the time domain is not always uniquely defined. In a system
excited by magnetic currents, (15) defines the element admittance if
the permeability $\mu$ is replaced by the permittivity $\epsilon$ and the electric
currents by their magnetic counterparts.

In the following theoretical discussion, it is assumed that the phased 
arrays are excited by a uniform amplitude and a linear phase taper. 


III. FINITE ARRAYS


Theorem 1: The element radiation impedance in a finite, steerable, linear
or planar phased array of scan-invariant current sources, radiating into
a linear, lossless, passive and time-invariant system, is an entire function”
of the scan angle $\theta$ in any given plane of scan, with an essential singularity*
at $\theta \to \infty$.


Proof: By (15a)

$$Z_{rs}\,|I_{rs}|^2 = \int\!\!\int \mathbf{J}^*_{rs}(x, y, z, \theta)\cdot\mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta)\cdot\mathbf{J}(\xi, \eta, \zeta, \theta)\,d\tau\,dv. \tag{16}$$

On expanding (16) in a double sum of integrals over all cells $\{(m, n)\}$,
$m = 0, 1, \ldots, M-1$; $n = 0, 1, \ldots, N-1$, and using the relationships
of (6) followed by a change of variable in each term of the sum, one
obtains

$$Z_{rs}(\theta) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} Z_{mnrs}\exp[j(\sigma_{mn} - \sigma_{rs})\cos\theta], \tag{17a}$$

where

$$Z_{mnrs} = \frac{1}{|I_{rs}|^2}\int\!\!\int \mathbf{J}^*_0(x, y, z)\cdot\mathbf{G}(x + ra,\ y + sb,\ z\,|\,\xi + ma,\ \eta + nb,\ \zeta)\cdot\mathbf{J}_0(\xi, \eta, \zeta)\,d\tau\,dv. \tag{17b}$$

In any given plane of scan $\varphi$ is constant, so that

$$\sigma_{mn} - \sigma_{rs} = k[(m - r)a\cos\varphi + (n - s)b\sin\varphi] = \sigma_{m-r,\,n-s} \tag{18}$$

is independent of $\theta$. Both $\cos\theta$ and the exponential function are entire
functions.† Consequently, the exponential function appearing in (17a)
is an entire function of an entire function, which is likewise entire’
(entire functions are also called integral functions). $Z_{rs}(\theta)$ is a finite
sum of entire functions and is also entire.

* R. V. Churchill, Ref. 10, Sec. 68, p. 157; Sec. 112, p. 270.
† R. V. Churchill, Ref. 10, Sec. 21, p. 47; Sec. 23, p. 50.

The nature of the essential singularity at $\theta \to \infty$ is obtained by first
expanding $\cos\theta$ in the complex $\theta$-plane

$$\cos(\theta_r + j\theta_i) = \cos\theta_r\cosh\theta_i - j\sin\theta_r\sinh\theta_i. \tag{19}$$

Then, if $|\theta_i| \to \infty$ in such a way that $(\sigma_{mn} - \sigma_{rs})\theta_i > 0$, the $mn$th
term behaves as $\exp\{|\sigma_{mn} - \sigma_{rs}|\sin\theta_r\exp[|\theta_i|]\}$. Q.E.D. Note
that even when $\mathbf{J}_0(x, y, z, \theta)$ is scan dependent, $Z_{rs}(\theta)$ is analytic
provided $\mathbf{J}_0(x, y, z, \theta)$ is analytic. However, other isolated singularities
may exist.
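The growth described above can be checked numerically. The following sketch, with hypothetical function names and sample values, compares the logarithm of the magnitude of one term of (17a) against the closed form implied by (19), and against the large-$|\theta_i|$ asymptote $\tfrac{1}{2}|\sigma_{mn} - \sigma_{rs}|\sin\theta_r\,e^{|\theta_i|}$ (the factor 1/2 comes from $\sinh\theta_i \approx \tfrac{1}{2}e^{|\theta_i|}$ and does not change the double-exponential character).

```python
import cmath
import math


def term_log_magnitude(delta_sigma, theta_r, theta_i):
    """log|exp(j*delta_sigma*cos(theta))| for theta = theta_r + j*theta_i."""
    theta = complex(theta_r, theta_i)
    # log|exp(w)| equals Re(w); working with the exponent avoids overflow.
    return (1j * delta_sigma * cmath.cos(theta)).real


def closed_form_log_magnitude(delta_sigma, theta_r, theta_i):
    """Same quantity obtained from (19): delta_sigma*sin(theta_r)*sinh(theta_i)."""
    return delta_sigma * math.sin(theta_r) * math.sinh(theta_i)


if __name__ == "__main__":
    delta_sigma, theta_r = math.pi, math.radians(60.0)   # illustrative values
    for theta_i in (1.0, 2.0, 4.0, 6.0):
        exact = term_log_magnitude(delta_sigma, theta_r, theta_i)
        asymptote = 0.5 * abs(delta_sigma) * math.sin(theta_r) * math.exp(abs(theta_i))
        print(f"theta_i={theta_i:3.1f}  log|term|={exact:10.3f}  asymptote={asymptote:10.3f}")
```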


Corollary 1a: Re$\{Z_{rs}\}$ and $\frac{\partial}{\partial\theta}$ Re$\{Z_{rs}\}$ are entire functions of $\theta$ each
with an essential singularity at $\theta \to \infty$. Proof appears in Appendix A.


Theorem 2: The power radiated by an element in a finite, steerable, linear 
or planar phased array of scan-invariant current sources, radiating into 
a lossless, linear, passive and time-invariant system cannot be kept con- 
stant over a continuous scanning range with lossless, linear, passive and 
time-invariant network elements and scatterers only.


Proof: Let $\mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta)$ be the dyadic Green's function of the
entire system including all equalizing elements. The radiation impedance
of the $mn$th element of the array is given by (15a) for a lossless, linear,
passive, time-invariant system. If the array is radiating constant power
over a continuous scanning range, the real part of the radiation
impedance, $R_{rs}(\theta)$ = Re$\{Z_{rs}\}$, must remain constant in that range and

$$\frac{\partial}{\partial\theta}[R_{rs}(\theta)] = 0, \qquad \theta_1 \le \theta_r \le \theta_2, \tag{20}$$

where $\theta = \theta_r + j\theta_i$. By Corollary 1a, $\frac{\partial}{\partial\theta}[R_{rs}(\theta)]$ is analytic in the
closed $\theta$-plane and has an essential singularity at $\theta \to \infty$. However,
if the derivative vanishes along the line $\theta_1 \le \theta_r \le \theta_2$ it must vanish
everywhere in the $\theta$-plane*. Hence, it cannot have an essential
singularity at infinity. The contradiction implies that $R_{rs}(\theta)$ cannot be
constant over a continuous scanning range. Q.E.D.

* P. M. Morse and H. Feshbach, Ref. 7, Vol. I, p. 390.


1570 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967


Equations (3) and (4) specify the directions of the beams' maxima;
however, not all of them correspond to real directions in space. Whereas
$\varphi_{pq}$ is real for all $(p, q)$, $\theta_{pq}$ can be either real or imaginary, as may be
seen from the solution of (4):

$$\varphi_{pq} = \tan^{-1}\frac{(\psi_y + 2q\pi)\,a}{(\psi_x + 2p\pi)\,b}, \qquad 0 \le \varphi_{pq} < \pi, \tag{21a}$$

$$\theta_{pq} = \cos^{-1}\frac{\psi_x + 2p\pi}{ka\cos\varphi_{pq}} = \cos^{-1}\frac{\psi_y + 2q\pi}{kb\sin\varphi_{pq}}, \qquad 0 \le \theta_{pq} \le \pi. \tag{21b}$$


If $\theta_{pq}$ is real it is said that the beam is in real space. By way of
mathematical generalization it is said that all those beams having an imaginary
$\theta_{pq}$ are in "imaginary space". If $\theta_{pq} = 0$, or $\theta_{pq} = \pi$, it is said that the
beam is in a grazing position between real and imaginary space. It
can easily be verified from (21) that for a given phasing $(\psi_x, \psi_y)$ every
pair $(p, q)$ corresponds to a unique direction $(\varphi_{pq}, \theta_{pq})$ in the complex
domain $0 \le \varphi < \pi$, $0 \le$ Re$\{\theta\} \le \pi$. These directions are the
characteristic directions of the system. They are directly related, through (4),
to the eigenvalues of

$$\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \Gamma^2 F = 0 \tag{22}$$

with the following periodic boundary conditions

$$F(x, y) = F(x + a, y)\exp(-j\psi_x), \tag{23a}$$

$$\frac{\partial F}{\partial x}(x, y) = \frac{\partial F}{\partial x}(x + a, y)\exp(-j\psi_x), \tag{23b}$$

$$F(x, y) = F(x, y + b)\exp(-j\psi_y), \tag{23c}$$

$$\frac{\partial F}{\partial y}(x, y) = \frac{\partial F}{\partial y}(x, y + b)\exp(-j\psi_y). \tag{23d}$$

The eigenfunctions, which form a complete orthogonal set in the
interval $0 \le x \le a$, $0 \le y \le b$, are

$$F_{pq}(x, y) = \exp\left[j(\psi_x + 2p\pi)\frac{x}{a}\right]\exp\left[j(\psi_y + 2q\pi)\frac{y}{b}\right], \qquad p, q = 0, \pm 1, \pm 2, \ldots, \pm\infty. \tag{24}$$

By (4) they can also be written as

$$F_{pq}(x, y) = \exp\{jk\cos\theta_{pq}\,(x\cos\varphi_{pq} + y\sin\varphi_{pq})\}. \tag{25}$$

The eigenvalues $\{\Gamma_{pq}\}$ are

$$\Gamma_{pq} = k\cos\theta_{pq}, \qquad p, q = 0, \pm 1, \pm 2, \ldots, \pm\infty. \tag{26}$$

The results thus derived lead to several interesting conclusions which
are summarized in the following lemmas.


Lemma 1: Every steerable linear or planar phased array with a linear 
phase taper has only a finite number of beams in real space. Proof appears 
in Appendix B. 
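Lemma 1 can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the function names, the cell sizes and the phasing are hypothetical, and the search bounds on $p$ and $q$ are the ones that follow from $|\psi_x| \le \pi$, $|\psi_y| \le \pi$. It evaluates the characteristic direction of (21) for a given pair $(p, q)$ and lists the finitely many beams with $|\cos\theta_{pq}| \le 1$.

```python
import cmath
import math


def characteristic_direction(p, q, psi_x, psi_y, a, b, k):
    """Return (phi_pq, theta_pq) per (21); theta_pq is complex in imaginary space."""
    u = (psi_x + 2.0 * p * math.pi) / (k * a)   # cos(theta) * cos(phi)
    v = (psi_y + 2.0 * q * math.pi) / (k * b)   # cos(theta) * sin(phi)
    phi = math.atan2(v, u) % math.pi            # fold into 0 <= phi < pi
    cos_theta = u * math.cos(phi) + v * math.sin(phi)   # signed, no division
    return phi, cmath.acos(cos_theta)


def real_space_beams(psi_x, psi_y, a, b, k):
    """List all (p, q) whose beams satisfy |cos(theta_pq)| <= 1."""
    p_max = math.floor(k * a / (2.0 * math.pi) + 0.5)   # |p| <= a/lambda + 1/2
    q_max = math.floor(k * b / (2.0 * math.pi) + 0.5)   # |q| <= b/lambda + 1/2
    beams = []
    for p in range(-p_max, p_max + 1):
        for q in range(-q_max, q_max + 1):
            u = (psi_x + 2.0 * p * math.pi) / (k * a)
            v = (psi_y + 2.0 * q * math.pi) / (k * b)
            if u * u + v * v <= 1.0:
                beams.append((p, q))
    return beams


if __name__ == "__main__":
    k = 2.0 * math.pi                      # wavelength normalized to 1
    for spacing in (0.5, 1.5):             # illustrative cell sizes in wavelengths
        beams = real_space_beams(0.0, 0.0, spacing, spacing, k)
        print(f"a = b = {spacing} wavelengths: {len(beams)} beam(s) in real space")
```

With half-wavelength cells and zero phasing only the $(0, 0)$ beam is real; enlarging the cells admits more beams, but always finitely many.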


For every pair of phasing $(\psi_x, \psi_y)$ there exists an infinite set of
characteristic directions $\{\theta_{pq}, \varphi_{pq}\}$. As the array is scanned by varying
the values of $(\psi_x, \psi_y)$ in the intervals $-\pi \le \psi_x \le \pi$, $-\pi \le \psi_y \le \pi$,
some characteristic directions will go through a grazing position going
from imaginary to real space or vice versa. We shall call such
characteristic directions "transitive characteristic directions".* Since the
condition for a grazing position is $|\cos\theta_{pq}| = 1$, it follows from Lemma 1
that the number of transitive characteristic directions is finite.


Lemma 2: The radiation impedance of an element in a linear or planar 
phased array can be expanded by an infinite series over all characteristic 
directions of the system. Proof appears in Appendix C. 


IV. INFINITE ARRAYS 


In analyzing large arrays it has been found useful to approximate 
the behavior of the center elements by the behavior of identical ele- 
ments in an infinite array of the same geometry.’* This approximation 
is motivated by the fact that the performance of the center elements 
is strongly affected through mutual coupling by contiguous elements, 
but very weakly by elements far away.7® 

The formulation of the infinite array problem may be obtained from 
the results derived for finite-size arrays by letting the number of 
elements M and N approach infinity. The infinite array problem can 
also be treated as a periodic structure by application of Floquet’s 
theorem. In the following, the latter approach is adopted, but first it 
is demonstrated that both methods are consistent. 

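The electric field of an infinite array as given by (8b) must satisfy
the same periodicity conditions as the source function, i.e.,

$$\mathbf{E}(x + ma,\ y + nb,\ z) = \mathbf{E}(x, y, z)\exp[j\sigma_{mn}(\varphi)\cos\theta]. \tag{27}$$

* Note the distinction made between "grazing position" and "transitive
characteristic direction". A beam associated with a transitive characteristic direction
may attain a grazing position for a particular phasing, but may also point in other
directions.

On the other hand, the electric field

$$\mathbf{E}(x, y, z, \theta) = -\int \mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta)\cdot\mathbf{J}(\xi, \eta, \zeta, \theta)\,d\tau \tag{28}$$

can be expanded in an infinite sum of integrals using the relationships
of (6):

$$\mathbf{E}(x, y, z, \theta) = -\sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\exp(j\sigma_{mn}\cos\theta)\int_{V_{00}} \mathbf{G}(x, y, z\,|\,\xi + ma,\ \eta + nb,\ \zeta)\cdot\mathbf{J}_0(\xi, \eta, \zeta)\,d\tau, \tag{29}$$

where $V_{00}$ is the volume occupied by $\mathbf{J}_0$. Define a new Green's function

$$\mathbf{G}_0(x, y, z\,|\,\xi, \eta, \zeta) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\exp(j\sigma_{mn}\cos\theta)\,\mathbf{G}(x, y, z\,|\,\xi + ma,\ \eta + nb,\ \zeta) \tag{30}$$

and notice that

$$\mathbf{G}_0(x, y, z\,|\,\xi + Ma,\ \eta + Nb,\ \zeta) = \exp(-j\sigma_{MN}\cos\theta)\,\mathbf{G}_0(x, y, z\,|\,\xi, \eta, \zeta) \tag{31}$$

since by (5)

$$\sigma_{m+M,\,n+N} = \sigma_{mn} + \sigma_{MN}. \tag{32}$$

From (27) and (31) it follows that $\mathbf{G}_0(x, y, z\,|\,\xi, \eta, \zeta)$ can be expanded
by the eigenfunctions (25) as

$$\mathbf{G}_0 = \sum_{p=-\infty}^{\infty}\sum_{q=-\infty}^{\infty}\mathbf{g}_{pq}(z\,|\,\zeta)\,F_{pq}(x, y)\,F^*_{pq}(\xi, \eta), \tag{33a}$$

where

$$\mathbf{g}_{pq}(z\,|\,\zeta) = \frac{1}{a^2b^2}\int_0^a\!\!\int_0^b\!\!\int_0^a\!\!\int_0^b \mathbf{G}_0\,F^*_{pq}(x, y)\,F_{pq}(\xi, \eta)\,dx\,dy\,d\xi\,d\eta. \tag{33b}$$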


Substituting (30) via (33a) into (15a) for the center element, $m =
n = 0$, one obtains

$$Z_{00} = \sum_{p=-\infty}^{\infty}\sum_{q=-\infty}^{\infty} z_{pq}, \tag{34a}$$

where

$$z_{pq} = \frac{1}{|I_{00}|^2}\int_{V_{00}}\!\int_{V_{00}} \mathbf{J}^*_0(x, y, z)\cdot\mathbf{g}_{pq}(z\,|\,\zeta)\,F^*_{pq}(\xi, \eta)\,F_{pq}(x, y)\cdot\mathbf{J}_0(\xi, \eta, \zeta)\,d\tau\,dv. \tag{34b}$$

Equation (34) is an alternate representation to (86) for the radiation
impedance of the infinite array element and it demonstrates that
Lemma 2 is valid for infinite arrays as well.

By substituting the new representation for $\mathbf{G}_0$, (30), (33), into (29)
and noting that the electric field satisfies the homogeneous reduced
wave equation in the source-free region, one obtains for the unbounded
space

PCI De SS Dei eae Geil (ei StaG 


p=-0 G=—-0 


where 


Yoo = VIR, — hk’ = jk sin O54 (35b) 
ee ~[ Joa? | O-JolE, n, OF ACE, 9) dr, (35¢) 


and $d_{\max}$ is the projection on the $z$-axis of the largest distance between
two points on the surface enclosing $V_{00}$. It can be seen that the electric
field in the source-free region, above the central area of a large array,
may be approximated by a finite number of homogeneous plane waves
propagating in the real characteristic directions, and an infinite number
of nonhomogeneous plane waves, exhibiting exponential decay in
the direction perpendicular to the plane of the array. The latter are
interpreted as waves propagating in the imaginary characteristic
directions.
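The split into propagating and decaying plane waves can be made concrete with (35b) and (26). The sketch below is illustrative only (function names and sample geometry are assumptions): it evaluates $\gamma_{pq} = \sqrt{\Gamma_{pq}^2 - k^2}$ for a given beam and reports the attenuation of $|\exp(-\gamma_{pq}|z|)|$ over one wavelength. A purely imaginary $\gamma_{pq}$ corresponds to a homogeneous plane wave, a real positive $\gamma_{pq}$ to an exponentially decaying one.

```python
import cmath
import math


def gamma_pq(p, q, psi_x, psi_y, a, b, k):
    """gamma_pq = sqrt(Gamma_pq^2 - k^2), with Gamma_pq = k*cos(theta_pq) from (26)."""
    u = (psi_x + 2.0 * p * math.pi) / (k * a)   # cos(theta) * cos(phi), from (4a)
    v = (psi_y + 2.0 * q * math.pi) / (k * b)   # cos(theta) * sin(phi), from (4b)
    gamma_sq = k * k * (u * u + v * v - 1.0)    # Gamma^2 - k^2
    return cmath.sqrt(gamma_sq)


def attenuation_db_per_wavelength(gamma, k):
    """Decay of |exp(-gamma*|z|)| over one wavelength, in dB (0 if propagating)."""
    wavelength = 2.0 * math.pi / k
    return 20.0 * math.log10(math.e) * gamma.real * wavelength


if __name__ == "__main__":
    k = 2.0 * math.pi                 # wavelength normalized to 1
    a = b = 0.5                       # illustrative half-wavelength cells
    for p, q in ((0, 0), (1, 0), (1, 1)):
        g = gamma_pq(p, q, 0.0, 0.0, a, b, k)
        kind = "propagating" if g.real == 0.0 else "evanescent"
        print((p, q), kind, f"{attenuation_db_per_wavelength(g, k):7.1f} dB per wavelength")
```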

In an infinite array all elements are embedded in an identical
environment, and therefore the power radiated by each element is the
same. There is no net power flow into a unit cell through the "side
walls". Consequently, the quantity Re$\{|I_{00}|^2 z_{pq}\}$ of (34b) is equal to
the power propagated by the plane wave $(p, q)$ within a unit cell in
the direction perpendicular to the plane of the array. By Lemma 1
there is only a finite number of plane waves with transitive
characteristic directions (see footnote p. 1571). Let them be distinguished
from all other plane waves by assignment of the subscript $(p, q) =
(\tau, \nu)$.


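$$\mathbf{E}_{\tau\nu} = \boldsymbol{\mathcal{E}}_{\tau\nu}\,F_{\tau\nu}(x, y)\exp(-jk\,|z|\sin\theta_{\tau\nu}), \qquad |z| > d_{\max}, \tag{36}$$

$$\mathbf{H}_{\tau\nu} = \boldsymbol{\mathcal{H}}_{\tau\nu}\,F_{\tau\nu}(x, y)\exp(-jk\,|z|\sin\theta_{\tau\nu}), \qquad |z| > d_{\max}, \tag{37}$$

where $F_{\tau\nu}(x, y)$ is given by (25), and $\boldsymbol{\mathcal{E}}_{\tau\nu}$ by (35c). If

$$\mathbf{D}_{\tau\nu} = jk[\cos\theta_{\tau\nu}\cos\varphi_{\tau\nu}\,\mathbf{a}_x + \cos\theta_{\tau\nu}\sin\varphi_{\tau\nu}\,\mathbf{a}_y - \sin\theta_{\tau\nu}\,\mathbf{a}_z] \tag{38}$$

then

$$\boldsymbol{\mathcal{H}}_{\tau\nu} = -\frac{1}{j\omega\mu}\,\mathbf{D}_{\tau\nu}\times\boldsymbol{\mathcal{E}}_{\tau\nu}. \tag{39}$$

The power radiated by a $(\tau, \nu)$ plane wave per unit cell into the upper
hemisphere is

$$P_{\tau\nu} = \frac{1}{2}\,\text{Re}\int_0^a\!\!\int_0^b (\mathbf{E}_{\tau\nu}\times\mathbf{H}^*_{\tau\nu})\cdot\mathbf{a}_z\,dx\,dy. \tag{40}$$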


Substitution of (36) through (39) into (40) gives 


sin 0% 


P,, = © sin eal | &,,-a, |? + | &,,-a, |? + = sin 6;, | Ss F| (41) 


where $\eta = (\mu/\epsilon)^{1/2}$. From (41) a radiation resistance per wave is defined as


Py 7 
| Lo k 


Since the entire system is passive and lossless, then by conservation
of energy, the power $P_{\tau\nu}$ must originate from the element itself. Hence,

$$R_{\tau\nu} = \text{Re}\{z_{\tau\nu}\}, \qquad \theta_{\tau\nu}\ \text{real}, \tag{43}$$

where $z_{\tau\nu}$ is given by (34b).

From (41) it follows that when a wave $(\tau, \nu)$ is in real space $R_{\tau\nu}$
is real, and when it is in imaginary space $R_{\tau\nu}$ is imaginary (in which
case Re$\{z_{\tau\nu}\} = 0$). Hence, of all the elements comprising the source's
load, $R_{\tau\nu}$ appears either resistive or reactive, depending upon the
scan angle. Such properties of a load, which are unknown in lumped
network theory, are a consequence of the losslessness postulate. When
propagation is possible power is carried away from the source. When
propagation is inhibited there is no net loss of power and the load
must be reactive. By Lemma 1 only Re$\{z_{\tau\nu}\}$ has those properties.
All other $z_{pq}$, $(p, q) \ne (\tau, \nu)$, and Im$\{z_{\tau\nu}\}$ always retain their
dissipative or reactive characteristics. Further, there is only a finite
number of terms having Re$\{z_{pq}\} > 0$. In practical phased arrays the
spacing between the elements and the scanning range are such that
only one such term exists at a time.

The following two definitions summarize the properties described 
above: 





R,, = (42) 


Definition 1: An O-type network function is a scan-dependent
immittance (impedance or admittance) which is seen by the source as
resistive when the beam is in real space and as reactive when the beam is
in imaginary space, and it behaves like an open circuit for impedance
and like a short circuit for admittance in the grazing position.


Definition 2: An E-type network function is a scan-dependent
immittance (impedance or admittance) which remains either resistive or
reactive when the beam passes through the grazing position.

The motivation behind the nomenclature introduced by the two
definitions will become clear later, in Theorems 4 and 5. The O-type and
E-type immittances are of distinct mathematical form. To arrive at it
consider first the following transformation:*

$$s = \sin\theta_{mn} \tag{44}$$

$$\chi = \cos\varphi_{mn}, \tag{45}$$

where $(m, n)$ is one particular transitive characteristic direction out
of all $(\tau, \nu)$. Given $s$ and $\chi$ all other characteristic directions are
uniquely determined. By (4)

$$\psi_x = ka\chi\sqrt{1 - s^2} - 2m\pi \tag{46}$$

$$\psi_y = kb\sqrt{1 - \chi^2}\,\sqrt{1 - s^2} - 2n\pi, \tag{47}$$

where $(1 - \chi^2)^{1/2} \ge 0$ for all possible $\chi$, and $(1 - s^2)^{1/2} > 0$ if $0 < \theta_{mn} < \pi/2$,
and $(1 - s^2)^{1/2} < 0$ if $\pi/2 < \theta_{mn} < \pi$. Then by substitution of (47)
into (22) all other characteristic directions are found:

$$\tan\varphi_{pq} = \frac{kab(1 - \chi^2)^{1/2}(1 - s^2)^{1/2} + 2(q - n)\pi a}{kab\chi(1 - s^2)^{1/2} + 2(p - m)\pi b}, \tag{48a}$$

$$\cos\theta_{pq} = f_{pq}(s), \tag{48b}$$

where

$$f_{pq}(s) = \frac{ka\chi(1 - s^2)^{1/2} + 2(p - m)\pi}{ka\cos\varphi_{pq}}. \tag{48c}$$


This suggests that when characteristic direction $(m, n)$ is scanned in a
plane $\chi$ = const, each of the components $z_{pq}$ of the total input impedance
as given by (34) can be expressed as a function of the same variable $s$.
The conformal mapping between the $\theta_{mn}$-plane and the $s$-plane is shown
in Fig. 3. In view of the branch cut $-1 \le \alpha \le 1$ it will be understood
that $s = \alpha$ denotes $s = \alpha - j0$ if $0 \le \theta_{mn} \le \pi/2$ and $s = \alpha + j0$ if
$\pi/2 \le \theta_{mn} \le \pi$. Let $s = s_{\tau\nu}$ be the value at which characteristic
direction $(\tau, \nu)$ is in grazing position. At this value

$$f^2_{\tau\nu}(s_{\tau\nu}) = 1. \tag{49}$$

Of all values $\{s_{\tau\nu}\}$ there is at least one which satisfies (49) for $s_{\tau\nu} = 0$.
From (48) it is obvious that $f^2_{mn}(0) = 1$, and there may be other
transitive characteristic directions $(\tau, \nu) \ne (m, n)$ which attain their grazing
positions at $s_{\tau\nu} = 0$.

* Recall that $\theta_{mn}$ and $\varphi_{mn}$ are not in the conventional spherical coordinate
system (see p. 1564).

Fig. 3. Conformal mapping $s = \sin\theta_{mn}$.
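The role of the scan variable $s$ can be explored with a short numerical sketch. It is a minimal example under stated assumptions: the function name and the sample geometry are hypothetical, and instead of evaluating (48c) directly it uses the equivalent relation $f^2_{pq} = u^2 + v^2$, where $u = \cos\theta_{pq}\cos\varphi_{pq}$ and $v = \cos\theta_{pq}\sin\varphi_{pq}$ follow from (4), (46) and (47). This avoids choosing a branch for $\varphi_{pq}$.

```python
import cmath
import math


def f_squared(s, chi, p, q, m, n, a, b, k):
    """cos^2(theta_pq) = f_pq(s)^2 for scan variable s in the plane chi = const."""
    root_s = cmath.sqrt(1.0 - s * s)        # (1 - s^2)^(1/2), branch for 0 <= theta_mn <= pi/2
    root_chi = math.sqrt(1.0 - chi * chi)   # (1 - chi^2)^(1/2) >= 0
    u = chi * root_s + 2.0 * (p - m) * math.pi / (k * a)        # cos(theta_pq) cos(phi_pq)
    v = root_chi * root_s + 2.0 * (q - n) * math.pi / (k * b)   # cos(theta_pq) sin(phi_pq)
    return u * u + v * v


if __name__ == "__main__":
    k = 2.0 * math.pi          # wavelength normalized to 1
    a = b = 0.6                # illustrative cell size in wavelengths
    chi = 0.3                  # illustrative plane of scan
    for s in (0.0, 0.3, 0.6, 0.9):
        own = f_squared(s, chi, 0, 0, 0, 0, a, b, k).real
        neighbour = f_squared(s, chi, -1, 0, 0, 0, a, b, k).real
        print(f"s={s:3.1f}  f_mn^2={own:6.3f}  f_(-1,0)^2={neighbour:6.3f}")
```

For $(p, q) = (m, n)$ the result reduces to $1 - s^2$, so $f^2_{mn}(0) = 1$ as stated; a neighbouring beam is in real space only where its value drops to 1 or below.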


Theorem 3: In an obstacle-free space, the impedance function $z_{pq}(s)$,
associated with the characteristic direction $(p, q)$, is an analytic function
of the complex variable $s = \alpha + j\beta$, with branch points at $s = s_{\tau\nu}$ and an
essential singularity at $|s| \to \infty$. If $(p, q) = (m, n)$ then $z_{mn}(s)$ may
have a simple pole at $s = s_{mn} = 0$.


Proof: The general definition of $z_{pq}$ is given by (34b) in which the
$\theta_{pq}$ dependence is contained in the Green's function component
$\mathbf{g}_{pq}(z\,|\,\zeta)\,F^*_{pq}(\xi, \eta)\,F_{pq}(x, y)$. The Green's function is derived from (10)
via (15b). Green's function $\mathcal{G}(x, y, z\,|\,\xi, \eta, \zeta)$ satisfies the same periodic
boundary conditions as $\mathbf{G}_0(x, y, z\,|\,\xi, \eta, \zeta)$ and can be expanded in a
series similar to (33a):

$$\mathcal{G} = \sum_{p=-\infty}^{\infty}\sum_{q=-\infty}^{\infty}\mathbf{C}_{pq}(z\,|\,\zeta)\,F_{pq}(x, y)\,F^*_{pq}(\xi, \eta). \tag{50}$$

By substitution of (50) into (10) and use of the orthogonality property
of $F_{pq}(x, y)$ one obtains a differential equation for $\mathbf{C}_{pq}(z\,|\,\zeta)$

$$\frac{d^2\mathbf{C}_{pq}}{dz^2} - \gamma^2_{pq}\,\mathbf{C}_{pq}(z\,|\,\zeta) = -\frac{1}{ab}\,\mathbf{I}\,\delta(z - \zeta), \qquad \gamma_{pq} = jk\sin\theta_{pq}, \tag{51}$$

with the additional requirement that as $|z| \to \infty$, $\mathbf{C}_{pq}$ behaves as an
outgoing or evanescent wave. The solution of (51) for free space is

$$\mathbf{C}_{pq}(z\,|\,\zeta) = \frac{\mathbf{I}}{2jabk\sin\theta_{pq}}\exp\{-jk\,|z - \zeta|\sin\theta_{pq}\}. \tag{52}$$

$\mathbf{g}_{pq}(z\,|\,\zeta)$ is obtained from $\mathbf{C}_{pq}(z\,|\,\zeta)$ through an operator $\mathcal{R}_{pq}$:

$$\mathbf{g}_{pq}(z\,|\,\zeta) = j\omega\mu\,\mathcal{R}_{pq}\,\mathbf{C}_{pq}(z\,|\,\zeta), \tag{53a}$$

where

$$\mathcal{R}_{pq} = \mathbf{I} + \frac{1}{k^2}\,\mathbf{D}_{pq}\mathbf{D}_{pq}, \tag{53b}$$

$\mathbf{D}_{pq}$ being given by (38). Substitution of (52) into (53a) followed by
substitution into (34b) gives

$$z_{pq} = \frac{\omega\mu}{2abk\,|I_{00}|^2\sin\theta_{pq}}\int_{V_{00}}\!\int_{V_{00}} \mathbf{J}^*_0(x, y, z)\cdot\mathcal{R}_{pq}\cdot\mathbf{J}_0(\xi, \eta, \zeta)\,\exp\{jk\cos\theta_{pq}[(x - \xi)\cos\varphi_{pq} + (y - \eta)\sin\varphi_{pq}] - jk\,|z - \zeta|\sin\theta_{pq}\}\,d\tau\,dv. \tag{54}$$

The integrand is an entire function of $\theta_{pq}$ with an essential singularity
at $|$Im$\{\theta_{pq}\}| \to \infty$. Hence,* if $\mathbf{J}_0(x, y, z)$ is piecewise continuous,
the integral is also an entire function with the same essential singularity.
By (48)

$$\sin\theta_{pq} = [1 - f^2_{pq}(s)]^{1/2}. \tag{55}$$

By Lemma 1,

$$f^2_{pq}(s) \ne 1 \quad \text{if } (p, q) \ne (\tau, \nu). \tag{56}$$

From Fig. 3 it is readily seen that $|s| < \infty$ when $|\theta_{mn}| < \infty$, which
implies, via (48), (55), that $|\cos\theta_{pq}| < \infty$ and $|\sin\theta_{pq}| < \infty$ as long
as $|s| < \infty$. Thus, the singularities introduced by the transformation
(44), (45) are the branch points at $s = s_{\tau\nu}$. Also if $(p, q) = (m, n)$†
$z_{pq}$ may have a simple pole at $s = 0$. (Note, for example, that for
horizontal polarization, $\mathbf{J}_0 = \mathbf{a}_x J_x$, there is a simple zero in the plane of
scan corresponding to $\varphi_{mn} = 0$, at $s = 0$.) Q.E.D.

* E. T. Copson, Ref. 11, Sec. 5.5, pp. 107-109.
† Recall that $(m, n)$ is the characteristic direction which defines the
transformation from $(\psi_x, \psi_y)$ into $(s, \chi)$, (44)-(47).

The above proof can be applied separately to the real and imaginary
parts of the right-hand side of (54). If $z_{pq} = R_{pq}(s) + jX_{pq}(s)$, then $R_{pq}(s)$
and $X_{pq}(s)$ are analytic functions of $s$, real on the real axis of $s$, with
an essential singularity at $|s| \to \infty$, branch points at $s = s_{\tau\nu}$, and
possibly simple poles at $s = 0$.

In systems other than obstacle-free space, the normalized complex
power $z_{pq}(s)$ has different forms. Except for isolated values of $s$, the
radiated power and the stored energy per unit cell are bounded and
continuous functions of $s$ over those portions of the real and imaginary
axes of the $s$-plane which have physical meaning. Hence, it is reasonable
to postulate that an analytic continuation of $z_{pq}$ as a function of scan
can be made into a region of the complex $s$-plane which includes
portions of both the real and imaginary axes. It may be of interest to
note that the impedance function $z_{pq}(s)$ derived by L. Stark’* for the
planar dipole array over a ground plane is analytic. The regularity
of $z_{pq}(s)$ depends directly on the regularity of $\mathbf{g}_{pq}(z\,|\,\zeta; s)$. The
singularities of $z_{pq}$ in the $s$-plane are determined by the boundary conditions
which $\mathbf{g}_{pq}(z\,|\,\zeta; s)$ satisfies.


Theorem 4: An E-type immittance function V(s) is an even function of s. 


Proof: Let the complex variable $s$ be defined with respect to the
transitive characteristic direction $(m, n)$. Once $(m, n)$ is chosen, the proper
branch of $(1 - s^2)^{1/2}$ in (48) is uniquely determined. Let $(k, l)$ denote
all other transitive characteristic directions which reach their transitive
position simultaneously with $(m, n)$. Formally, this implies

$$f^2_{\tau\nu}(0) = 1 \ \Rightarrow\ (\tau, \nu) = (m, n),\ (k, l). \tag{57}$$

As a consequence of Definition 2 and Lemma 1, $V(s)$ is recognizable as

$$V(s) = \begin{cases} R_{pq}(s), & (p, q) \ne (m, n),\ (k, l), \\ X_{pq}(s), & \text{all } (p, q), \end{cases} \tag{58}$$

where $R_{pq}(s) + jX_{pq}(s) = z_{pq}(s)$, $z_{pq}$ given by (54). Thus, (58)
establishes the connection between the defined E-type function and physical
quantities corresponding to $R_{pq}(s)$ and $X_{pq}(s)$. Consider Definition 2
which states

$$V(s) - V^*(s) = 0, \qquad s = \alpha,\ \ 0 < \alpha < 1, \tag{59a}$$

$$V(s) - V^*(s) = 0, \qquad s = j\beta. \tag{59b}$$


STEERABLE PHASED ARRAY ANTENNAS 1579


Since $V(s)$ is analytic and also real on the real axis of $s$, (59) may be
rewritten as*

$$V(s) - V(s^*) = 0, \qquad s = \alpha,\ \ 0 < \alpha < 1, \tag{60a}$$

$$V(s) - V(s^*) = 0, \qquad s = j\beta. \tag{60b}$$

On the real axis

$$V(\alpha) - V(\alpha) = 0. \tag{61a}$$

On the imaginary axis

$$V(j\beta) - V(-j\beta) = 0. \tag{61b}$$

By analytic continuation† of (61b) from the imaginary axis to a point $s$
in the complex plane one obtains

$$V(s) - V(-s) = 0. \tag{62}$$

Hence, $V(s)$ is an even function of $s$. Q.E.D.


Theorem 5: An O-type immittance W(s) is an odd function of s. The 
proof is similar to that of Theorem 4 and it appears in Appendix D. 
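The link between parity and resistive or reactive behavior can be seen in a toy example. The sketch below is an illustration only, not part of the proofs: it takes a hypothetical power series with real coefficients and shows that on the imaginary axis $s = j\beta$ the even-power part stays real (the E-type behavior of Definition 2), while the odd-power part becomes purely imaginary (the O-type behavior of Definition 1). Theorems 4 and 5 state the converse, that the behavior forces the parity.

```python
def even_and_odd_parts(coeffs, s):
    """Split sum(c_i * s**i) into its even-power and odd-power parts."""
    even = sum(c * s**i for i, c in enumerate(coeffs) if i % 2 == 0)
    odd = sum(c * s**i for i, c in enumerate(coeffs) if i % 2 == 1)
    return even, odd


if __name__ == "__main__":
    coeffs = [1.0, 0.5, -0.25, 2.0, 0.125]    # hypothetical real coefficients
    for s in (0.4, 0.4j):                      # s = alpha, then s = j*beta
        even, odd = even_and_odd_parts(coeffs, s)
        print(s, complex(even), complex(odd))
```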


It has been shown in Theorem 2 that a finite phased array cannot
be perfectly matched over a continuous scanning range. The proof is
limited to finite arrays and cannot be directly extended to infinite
arrays since the representation of the element impedance by (17a) does
not guarantee convergence in the complex $\theta$-plane if the limits of the
summations are extended to infinity. In treating the infinite array, the
element impedance is derived by symmetry considerations from which
it is concluded that the net complex power radiated from each element
is conserved entirely within the unit cell of that element. It has been
shown that the two definitions are consistent. Although the problem of
whether an infinite array can be perfectly matched is of academic
interest only, it is worthwhile noting that, as for finite arrays, the answer
in this case is also negative. To show this the reader may recall that
the impedance has been defined as normalized power and postulated to
be an analytic function of the scan variable $s = \alpha + j\beta$. The
normalization constant is $|I_{00}|^2$ given by (13). If the complex power as a
function of scan is represented by

$$P(s) = |I_{00}|^2\,[R(s) + jX(s)], \tag{63}$$

then by Lemma 1, the term $R(s)$ is a finite sum of analytic functions
of the complex variable $s$. Consequently, $R(s)$ is an analytic function
of $s$. In general, it may be represented as

$$R(s) = E(s) + O(s), \tag{64}$$

where $E(s)$ is an even function of $s$ and $O(s)$ is an odd function of $s$.
Under conditions of perfect match over a continuous range, constant
power, $P_0$, is radiated over that range. Since $R(s)$ is analytic it implies
$R(s) = P_0\,|I_{00}|^{-2}$ everywhere in the $s$-plane. Since a constant is even,
$O(s) = 0$. Further, $E(s)$ must have a branch cut on the real axis of the
$s$-plane in the interval $[-1, 1]$. But the branch cut does not exist if
$E(s) = P_0\,|I_{00}|^{-2}$. The contradiction implies that $P(s)$ in (63) cannot
equal a constant over a continuous range of $s$.

* P. M. Morse and H. Feshbach, Ref. 7, Vol. I, p. 393.
† Morse and Feshbach, Op. Cit., p.3


Theorem 6: The resistance and reactance functions of an element, or their
derivatives, in an infinite linear or planar phased array of current sources
are discontinuous when a grating lobe is in a grazing position.


Proof: In an infinite array the grating lobes are plane waves propagating
in the characteristic directions. By Theorems 4 and 5 the element
impedance $Z(s)$ in an obstacle-free space can be written as

$$Z(s) = P(s) + \frac{Q(s)}{s}. \tag{65}$$

For real values of $s$, $P(s)$ is an even complex function of $s$ bounded at
$s = 0$, and $Q(s)$ is an even real function of $s$ nonzero at $s = 0$. On the
real axis of $s$

$$Z(\alpha) = P(\alpha) + \frac{Q(\alpha)}{\alpha}. \tag{66a}$$

On the imaginary axis of $s$

$$Z(j\beta) = P(j\beta) - j\,\frac{Q(j\beta)}{\beta}. \tag{66b}$$

A grating lobe is in its transitive position at $s = 0$. The pole
discontinuities are established by showing that

$$\text{Re}\left\{\lim_{\alpha\to 0} Z(\alpha) - \lim_{\beta\to 0} Z(j\beta)\right\} = \lim_{\alpha\to 0}\frac{Q(\alpha)}{\alpha} = \pm\infty, \tag{67a}$$

$$\text{Im}\left\{\lim_{\alpha\to 0} Z(\alpha) - \lim_{\beta\to 0} Z(j\beta)\right\} = \lim_{\beta\to 0}\frac{Q(j\beta)}{\beta} = \pm\infty. \tag{67b}$$

The pole discontinuity has to be interpreted as an invalid
mathematical solution at the transitive position. It is a result of the
idealization introduced by the concept of an "infinite array." If $R_{mn}(s)$ has a
simple zero at $s = 0$, as is the case when a horizontally polarized array
is placed above a ground plane, then the active impedance in the
neighborhood of $s = 0$ can be written as


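$$Z(s) = R(s) + jX(s), \tag{68a}$$

where $R(s)$ and $X(s)$ are real functions of $s$ (real for $s$ real),

$$R(s) = \sum_{i=0}^{\infty} a_i s^i \tag{68b}$$

$$X(s) = \sum_{i=0}^{\infty} b_i s^i. \tag{68c}$$

When the beam with the transitive characteristic direction is in real
space, $s = \alpha$,

$$R_+ = R(\alpha) = \sum_{i=0}^{\infty} a_i\alpha^i, \tag{69a}$$

and when it is in imaginary space, $s = j\beta$,

$$R_- = \text{Re}\{R(j\beta)\} = \sum_{i=0}^{\infty}(-1)^i a_{2i}\beta^{2i}. \tag{69b}$$

The discontinuity in the derivative of the resistance is

$$\lim_{\alpha\to 0}\frac{dR_+}{d\alpha} - \lim_{\beta\to 0}\frac{dR_-}{d\beta} = a_1. \tag{70}$$

Similarly, the reactance

$$X_+ = X(\alpha) = \sum_{i=0}^{\infty} b_i\alpha^i \tag{71a}$$

$$X_- = \text{Im}\{Z(j\beta)\} = \sum_{i=0}^{\infty}(-1)^i\,[b_{2i}\beta^{2i} + a_{2i+1}\beta^{2i+1}] \tag{71b}$$

$$\lim_{\alpha\to 0}\frac{dX_+}{d\alpha} - \lim_{\beta\to 0}\frac{dX_-}{d\beta} = b_1 - a_1. \tag{72}$$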
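The slope discontinuity of the resistance can be checked with a finite-difference sketch. The coefficients below are hypothetical, the function names are illustrative, and the one-sided difference step is an arbitrary choice. The code evaluates the power series for $R$ on the real axis ($s = \alpha$) and the real part of the same series on the imaginary axis ($s = j\beta$), then compares the two slopes at the origin.

```python
def r_plus(a, alpha):
    """R(alpha) = sum(a_i * alpha**i): resistance with the beam in real space."""
    return sum(c * alpha**i for i, c in enumerate(a))


def r_minus(a, beta):
    """Re{R(j*beta)}: only even powers survive, with alternating signs."""
    return sum(c * (1j * beta)**i for i, c in enumerate(a)).real


def slope_jump(a, h=1e-6):
    """One-sided finite-difference estimate of dR+/d(alpha) - dR-/d(beta) at 0."""
    d_plus = (r_plus(a, h) - r_plus(a, 0.0)) / h
    d_minus = (r_minus(a, h) - r_minus(a, 0.0)) / h
    return d_plus - d_minus


if __name__ == "__main__":
    a = [0.0, 2.0, 0.5, -1.0]     # hypothetical coefficients; a_0 = 0 gives a simple zero
    print("estimated jump:", slope_jump(a), " a_1 =", a[1])
```

The estimated jump approaches the linear coefficient $a_1$ as the step shrinks, in line with the statement that a simple zero produces a discontinuity in the first derivative.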
The proof can be generalized for any order algebraic singularity or
zero at $s = 0$. For example, if there is a zero of multiplicity $N$ the
discontinuity will be in the $N$th derivatives of the resistance and
reactance. A noninteger order zero yields a discontinuity after a sufficient
number of differentiations. Q.E.D.


V. SUMMARY AND CONCLUSIONS


A new mathematical approach to phased arrays has been adopted 
to investigate and discover various properties of the radiation imped- 
ance of an array element as a function of scan angle. The underlying 
idea of the method is the treatment of the impedance as an analytic 
function of a complex scan variable, which enables one to prove that 
an array element subject to the model chosen cannot be perfectly 
matched over a continuous scanning range by using lossless, linear, 
passive and time-invariant elements. 

The first half of the theory is devoted to finite arrays. It is shown 
that the directions (in space) of the beams’ maxima are eigenvalues 
of a Laplacian differential operator with periodic boundary conditions, 
which are related to the phase taper of the array. It is proven that 
there exists only a finite number of real eigenvalues. The known
concept of imaginary space is then adopted to accommodate the
imaginary eigenvalues. Furthermore, it is demonstrated that all beams except
a finite number are completely confined either to real space or to 
imaginary space, and that only a finite number of beams may attain 
a grazing position. The unique properties of the latter beams have 
been found to play an important role in the investigation of infinite 
arrays, to which the second half of the theory is devoted. 

The interest in infinite arrays, apart from its academic aspect, stems
from the good approximation it provides for the behavior of the center
portion of a large finite array. It has been found that the infinite
array element impedance as a function of scan is restricted to a
specific mathematical form. It is the authors' hope that recognition of the
limitations imposed by that form may provide useful guidelines in
achieving optimal match of an array to space.


APPENDIX A

Proof of Corollary 1a

Corollary 1a: Re$\{Z_{rs}\}$ and $\frac{\partial}{\partial\theta}$ Re$\{Z_{rs}\}$ are entire functions of $\theta$ each
with an essential singularity at $\theta \to \infty$.

Proof: Denoting

$$Z_{mnrs} = R_{mnrs} + jX_{mnrs} \tag{73}$$

one obtains from (17a)

$$\text{Re}\{Z_{rs}(\theta)\} = R_{rs}(\theta) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\{R_{mnrs}\cos[B_{mnrs}(\theta)] - X_{mnrs}\sin[B_{mnrs}(\theta)]\}, \tag{74}$$

where

$$B_{mnrs}(\theta) = (\sigma_{mn} - \sigma_{rs})\cos\theta \tag{75}$$

and

$$\frac{\partial}{\partial\theta}R_{rs}(\theta) = -\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\frac{\partial B_{mnrs}(\theta)}{\partial\theta}\{R_{mnrs}\sin[B_{mnrs}(\theta)] + X_{mnrs}\cos[B_{mnrs}(\theta)]\}. \tag{76}$$

Since $\cos\theta$ is an entire function of $\theta$, $\cos[B_{mnrs}(\theta)]$ and $\sin[B_{mnrs}(\theta)]$
are entire functions of an entire function, and are therefore entire. The
existence of the essential singularity can be demonstrated in a similar
fashion to that in Theorem 1. Q.E.D.
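Expressions (74) through (76) lend themselves to a quick numerical sanity check. In the sketch below each term is a hypothetical triple $(R_{mnrs}, X_{mnrs}, \sigma_{mn} - \sigma_{rs})$ with made-up values; the analytic derivative is compared against a central finite difference for real $\theta$.

```python
import math


def resistance(terms, theta):
    """R_rs(theta) per (74); each term is (R, X, delta_sigma)."""
    total = 0.0
    for r, x, delta_sigma in terms:
        big_b = delta_sigma * math.cos(theta)            # B_mnrs(theta), (75)
        total += r * math.cos(big_b) - x * math.sin(big_b)
    return total


def resistance_derivative(terms, theta):
    """dR_rs/dtheta per (76), with dB/dtheta = -delta_sigma*sin(theta)."""
    total = 0.0
    for r, x, delta_sigma in terms:
        big_b = delta_sigma * math.cos(theta)
        d_big_b = -delta_sigma * math.sin(theta)
        total += -d_big_b * (r * math.sin(big_b) + x * math.cos(big_b))
    return total


if __name__ == "__main__":
    terms = [(1.0, 0.5, 0.0), (0.3, -0.2, 3.1), (-0.1, 0.4, -2.0)]   # hypothetical values
    theta, h = 0.7, 1e-5
    numeric = (resistance(terms, theta + h) - resistance(terms, theta - h)) / (2.0 * h)
    print("analytic:", resistance_derivative(terms, theta), " numeric:", numeric)
```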


APPENDIX B 


Proof of Lemma 1 
Lemma 1: Every steerable linear or planar phased array with a linear 
phase taper has only a finite number of beams in real space. 


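Proof: A beam $(p, q)$ is in real space if $|\cos\theta_{pq}| \le 1$. Dividing (4a) by
$ka$ and (4b) by $kb$, squaring and adding, one obtains

$$\left(\frac{\psi_x + 2p\pi}{ka}\right)^2 + \left(\frac{\psi_y + 2q\pi}{kb}\right)^2 \le 1 \tag{77}$$

or

$$\left(\frac{\psi_x}{2\pi} + p\right)^2\left(\frac{\lambda}{a}\right)^2 + \left(\frac{\psi_y}{2\pi} + q\right)^2\left(\frac{\lambda}{b}\right)^2 \le 1, \tag{78}$$

where $\lambda = 2\pi/k$. Necessary conditions for the above inequality to be
satisfied are

$$\left|\frac{\psi_x}{2\pi} + p\right| \le \frac{a}{\lambda} \tag{79}$$

$$\left|\frac{\psi_y}{2\pi} + q\right| \le \frac{b}{\lambda}. \tag{80}$$

Since $|\psi_x| \le \pi$ and $|\psi_y| \le \pi$,

$$|p| \le \frac{a}{\lambda} + \frac{1}{2} \tag{81}$$

$$|q| \le \frac{b}{\lambda} + \frac{1}{2}. \tag{82}$$

Hence, both $p$ and $q$ are bounded. Q.E.D.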


APPENDIX C 


Proof of Lemma 2 


Lemma 2: The radiation impedance of an element in a linear or planar 
phased array can be expanded by an infinite series over all character- 
istic directions of the system. 


Proof: The current density excitation function of a finite-size array
given by (1), (2) satisfies the periodic boundary conditions (23) in
the finite domain occupied by the array. Let this domain be denoted
by $D$. The current density can, therefore, be uniquely expanded in $D$
in terms of the eigenfunctions (25):

$$\mathbf{J}(x, y, z) = U(x, y, D)\sum_{p=-\infty}^{\infty}\sum_{q=-\infty}^{\infty}\mathbf{j}_{pq}(z)\,F_{pq}(x, y), \tag{83}$$

where

$$\mathbf{j}_{pq}(z) = \frac{1}{ab}\int_0^a\!\!\int_0^b \mathbf{J}(x, y, z)\,F^*_{pq}(x, y)\,dx\,dy \tag{84}$$

and $U(x, y, D)$ is a two-dimensional unit step function

$$U(x, y, D) = \begin{cases} 1, & (x, y)\ \text{in}\ D, \\ 0 & \text{otherwise.} \end{cases} \tag{85}$$

Substitution of (83) into (15a) yields

$$Z_{mn} = \sum_{p=-\infty}^{\infty}\sum_{q=-\infty}^{\infty} z_{pq}, \tag{86a}$$

$$z_{pq} = \frac{1}{|I_{mn}|^2}\int\!\!\int \mathbf{J}^*_{mn}(x, y, z)\cdot\mathbf{G}(x, y, z\,|\,\xi, \eta, \zeta)\cdot\mathbf{j}_{pq}(\zeta)\,F_{pq}(\xi, \eta)\,U(\xi, \eta, D)\,d\tau\,dv. \tag{86b}$$


APPENDIX D


Proof of Theorem 5 
Theorem 5: An O-type immittance JW(s) is an odd function of s. 


Proof: Let the complex variable s be defined with respect to the tran- 
sitive characteristic direction (m, n). Let (k, 1) be all other transitive 
characteristic directions which reach their transitive position simul- 
taneously with (m, n). Then as a consequence of Definition 1 and 
Lemma 1 


$W(s) = R_{pq}(s), \quad (p, q) = (m, n), (k, l), \qquad (87)$
where $R_{pq}(s) = \mathrm{Re}\{Z_{pq}\}$, with $Z_{pq}$ given by (54). Thus, (87) establishes the
connection between the defined O-type function and a physical
quantity corresponding to W(s). From Definition 1


$W(s) - W^*(s) = 0, \quad s = \alpha, \quad 0 < \alpha < 1 \qquad (88)$
$W(s) + W^*(s) = 0, \quad s = j\beta. \qquad (89)$
Since W(s) is real on the real axis of s, (88), (89) may be rewritten as
$W(s) - W(s^*) = 0, \quad s = \alpha \qquad (90)$
$W(s) + W(s^*) = 0, \quad s = j\beta. \qquad (91)$

On the real axis
$W(\alpha) - W(\alpha) = 0. \qquad (92)$


On the imaginary axis


$W(j\beta) + W(-j\beta) = 0. \qquad (93)$
By analytic continuation of (93) from the imaginary axis to a point 
s in the complex plane one obtains 


W(s) + W(—s) = 0. (94) 
Hence, W(s) is an odd function of s. Q.E.D. 
An Energy-Density Antenna for
Independent Measurement of the 
Electric and Magnetic Field 


By W. C.-Y. LEE 


An energy-density antenna which can measure both the E field and H
field of a plane wave simultaneously has been developed, consisting of two
small orthogonal semiloops over a ground plane. Hybrids were used to take
the sum and difference of the loop outputs, giving voltages uniquely
proportional to the E and H fields. The loop dimensions and optimum
configurations were experimentally determined by measurements at a frequency
of 836 MHz in a man-made free-space environment. Energy-density
computation from the measured E and H fields of a standing wave in free space
showed that the maximum-to-minimum range of the energy density is much
less than that of either the E or H fields alone.


I. INTRODUCTION 
A new way of reducing the signal fading encountered on a mobile
radio transmission path is being investigated.¹ One source of fading
is due to the fact that plane waves propagating in opposite directions
at the same frequency produce a standing wave with nulls in the
electric field every half free-space wavelength. The magnetic field also
has nulls like the electric field but displaced a quarter wavelength
from the electric field nulls. The electromagnetic energy density of
such a pure standing wave is constant. If we sample E and H in free
space and amplify the signals by the appropriate relative gains,
square and add them, we obtain a signal proportional to
electromagnetic energy density
$w = \tfrac{1}{2}(\epsilon E^2 + \mu H^2). \qquad (1)$
The resulting output would be constant as we move through this 
idealized standing wave pattern. This method of energy-density uti- 
lization may be helpful in overcoming the rapid fading due to motion 
through the more complicated standing wave patterns in the mobile 
radio electromagnetic field. To utilize the energy concept, we need an
antenna that has three outputs independently proportional to the
field components $E_z$, $H_x$, and $H_y$ at any point in the field (assuming
vertical polarization). Since neither the ordinary loop antenna nor the
shielded loop antenna can be used in this particular case, an
investigation was undertaken to develop a suitable antenna.
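
The constancy of the energy density in an ideal standing wave can be checked numerically. The following is a minimal sketch, assuming two equal-amplitude plane waves travelling in opposite directions in free space and using peak field amplitudes; the function names and the unit amplitude `e0` are illustrative assumptions, not part of the original work.

```python
import math

EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
MU0 = 4e-7 * math.pi           # permeability of free space, H/m
ETA0 = math.sqrt(MU0 / EPS0)   # free-space wave impedance, about 377 ohms

def standing_wave(x, wavelength, e0=1.0):
    """Peak E and H amplitudes at position x for two equal, opposed plane waves."""
    k = 2.0 * math.pi / wavelength
    e_amp = 2.0 * e0 * abs(math.cos(k * x))            # E nulls every half wavelength
    h_amp = 2.0 * (e0 / ETA0) * abs(math.sin(k * x))   # H nulls shifted a quarter wavelength
    return e_amp, h_amp

def energy_density(e_amp, h_amp):
    """Equation (1): w = (1/2)(eps * E^2 + mu * H^2)."""
    return 0.5 * (EPS0 * e_amp ** 2 + MU0 * h_amp ** 2)

wavelength = 299792458.0 / 836e6                          # about 0.359 m at 836 MHz
samples = [i * wavelength / 200.0 for i in range(201)]    # one full wavelength
e_values = [standing_wave(x, wavelength)[0] for x in samples]
w_values = [energy_density(*standing_wave(x, wavelength)) for x in samples]

# E swings between a deep null and its peak, while w stays flat.
print("E min/max:", min(e_values), max(e_values))
print("w min/max:", min(w_values), max(w_values))
```

In this idealized model the E amplitude passes through nulls while the combined quantity w does not vary, which is the property the energy-density antenna is meant to exploit.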

This paper describes a particular antenna* which satisfactorily meets 
these requirements. The antenna consists of two small orthogonal loops 
and will be described later. Measurements on such an antenna and 
several other comparable ones were made in a simulated free-space 
environment. 


II. METHOD OF TESTING THE PROBES 


First of all, we need a method of test which tells us how well the
antenna is responding to the H field alone. As mentioned before, the
nulls of the E and H fields in an ideal standing wave pattern are λ/4
apart. Therefore, if we can establish such an ideal pattern, the E nulls
can be located accurately by a whip antenna; then the positions of the
H nulls are known. Then we can test the magnetic probe in this
environment, looking for nulls at these H-null positions.

A conducting ground plane 16 feet × 3 feet was surrounded with
commercially available absorbers (minimum absorption is 17 dB
one-way) to provide a man-made free space. Two waves traveling in
opposite directions were produced by exciting two identical
transmitting antennas from a common source. These two transmitting
antennas “S” and “N,” approximately 12 λ apart, were λ/4 whip
antennas operating at 836 MHz over the ground plane as shown in Fig.
1(a). The receiving antenna under test could slide in a slot about 2 λ
long which is in between the two transmitting antennas. E fields were
first tested separately from the two transmitting antennas in order to 
make sure that the reflections in the man-made free space were small, 
and that the individual fields were sensibly constant along the length 
of the slot. The two curves shown in Fig. 2 are the amplitudes of the 
signal from each of the transmitting antennas. The field from the “N” 
antenna had a maximum-to-minimum variation range of about 2.5 dB, 
and that from the “S” antenna a variation range of about 3.5 dB. 


*A brief description of this antenna appears in two papers: (1) Theoretical and 
Experimental Study of the Properties of the Signal from an Energy Density 
Mobile Radio Antenna, presented at the IEEE Vehicular Communications Con- 
ference on December 2, 1966, in Montreal, Canada. (2) Statistical Analysis of 
the Level Crossings and Duration of Fades of the Signal from an Energy Density 
Mobile Radio Antenna, B.S.T.J., 46, February, 1967, pp. 417-448.
Fig. 1. (a) Experimental set-up. (b) Energy-density antenna (double
orthogonal loop antenna).


These variations, due to residual reflections, were felt to be acceptable. 
Since the average amplitudes of signal strength of two transmitting 
antennas were not quite the same, 11-dB attenuation was put on “S” 
antenna, and 10 dB on “N” antenna in order to get a good standing 
wave. The peak-to-null value of the standing wave produced when 
Fig. 2. Amplitude of signal strength along the slot, received from one
transmitting antenna only.


both transmitting antennas were excited was then 23 dB, as shown in
Fig. 3. We should remember that the measured standing wave was
obtained from two E fields. It follows that a standing wave of the H
field exists which has the same peak-to-null value but is shifted by λ/4
from the standing wave of the E field.
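
As a rough consistency check on the 23-dB figure, the peak-to-null ratio of two opposed waves can be related to their amplitude imbalance. This is a sketch under the assumption of an ideal two-wave superposition with no other reflections; the function names are illustrative.

```python
import math

def peak_to_null_db(ratio):
    """Peak-to-null ratio (dB) for two opposed waves with amplitude ratio 0 < ratio < 1."""
    return 20.0 * math.log10((1.0 + ratio) / (1.0 - ratio))

def imbalance_for_peak_to_null(db):
    """Amplitude ratio of the weaker to the stronger wave giving the stated peak-to-null value."""
    s = 10.0 ** (db / 20.0)
    return (s - 1.0) / (s + 1.0)

ratio = imbalance_for_peak_to_null(23.0)
imbalance_db = -20.0 * math.log10(ratio)
print("amplitude ratio:", ratio)
print("residual imbalance in dB:", imbalance_db)
```

Under this idealization a 23-dB peak-to-null value corresponds to a residual imbalance of a little over 1 dB between the two waves, which is of the same order as the variations reported along the slot.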


III. TYPE OF ANTENNAS TESTED 


3.1 Single-Ended Loop


A semiloop with one end grounded and the other end as output can 
be used as a magnetic field probe. However, the size of the loop is 
critical. Large errors are obtained in measuring the magnetic fields 
unless its diameter is less than 0.01 λ (about 0.14 inch diameter at
836 MHz).²
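
The loop sizes quoted in wavelengths can be reproduced from the operating frequency. A minimal sketch, assuming free-space propagation; the 1.5-inch diameter is the loop size the paper selects later in Section 4.1.

```python
C = 299792458.0   # speed of light, m/s
INCH = 0.0254     # metres per inch

def wavelength_inches(freq_hz):
    """Free-space wavelength in inches at the given frequency."""
    return C / freq_hz / INCH

lam = wavelength_inches(836e6)        # roughly 14.1 inches
small_loop = 0.01 * lam               # the 0.01-wavelength limit, roughly 0.14 inch
chosen_loop = 1.5 / lam               # a 1.5-inch loop in wavelengths, roughly 0.1

print("wavelength (in):", lam)
print("0.01 wavelength (in):", small_loop)
print("1.5 inch in wavelengths:", chosen_loop)
```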


3.2 Double-Ended Loop


A semiloop with two output ends can be used as a combined electric 
and magnetic probe. If the double-ended loop is in the field of a plane 
wave, the sum of the two outputs of the semiloop is proportional to the 
E field, and their difference to the H field. If the plane of the loop is 
in line with the direction of propagation the output is proportional to 
the total H field, otherwise only to a component of H field. This would 
be a limitation in using this type of probe for general purposes. 
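
The sum-and-difference idea can be illustrated with a toy model. This is only a sketch: the coupling constants `k_e` and `k_h`, the cosine dependence on orientation, and the function names are assumptions made for illustration, not measured properties of the probe.

```python
import math

def loop_outputs(e_field, h_field, angle_deg, k_e=1.0, k_h=1.0):
    """Idealized voltages at the two ends of a double-ended semiloop.

    angle_deg is the angle between the plane of the loop and the direction
    of propagation; k_e and k_h are hypothetical coupling constants.
    """
    h_coupled = h_field * math.cos(math.radians(angle_deg))
    v_a = k_e * e_field + k_h * h_coupled
    v_b = k_e * e_field - k_h * h_coupled
    return v_a, v_b

def hybrid(v_a, v_b):
    """Sum and difference ports of an ideal hybrid, scaled by one half."""
    return 0.5 * (v_a + v_b), 0.5 * (v_a - v_b)

for angle in (0.0, 60.0, 90.0):
    e_out, h_out = hybrid(*loop_outputs(2.0, 3.0, angle))
    print(angle, e_out, h_out)
```

With the loop plane in line with the direction of propagation the difference port returns the full H value; at other angles it returns only a component, which is the limitation noted above.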


3.3 Two Orthogonal Loops 


This antenna has been proposed for receiving a linearly polarized 
wave coming from a remote source which may not necessarily be in 
line with the plane of the loop. It consists of two double-ended loops 
with their planes perpendicular to each other. The “orthogonal loop 
antenna” has two pairs of outputs. Adding two pairs of outputs
separately gives two values which should be identical and expressed
theoretically as proportional to the total E field. Subtracting two pairs
of outputs separately gives the two components of the H field. These
two components are the components along the rectangular coordinates 
which have been defined by the planes of the two loops. The orthogonal 
loop antenna is an electric and magnetic field probe which appears to 
be promising for probing the energy density of the total field. Hence, 
it is called the energy-density antenna. 


3.3.1 Connected Loops 


The two loops are electrically connected at the top point. Since this
configuration allows the two loops to be identical, the two values of
E field obtained from the two loops are expected to be equal, and the
currents in the two loops correspond to the two components of the
H field which are normal to the planes of the two loops. However, the
connection at the top points is not exactly at the middle, which may
introduce some errors.
Fig. 3. Standing wave along the slot, using a whip antenna as a receiving
probe.

3.3.2 Unconnected Loops


The two loops are not connected electrically at the top points. In this
configuration the two loops cannot be identical. One loop must be bent
at the top in order to disengage the top point from that of the other
loop. Therefore, the E fields obtained from the two loops may differ,
as may the two H component fields. However, the current in one loop
may not be affected by the other due to the fact that the two loops are
unconnected.


IV. EXPERIMENTAL RESULTS 


4.1 Single-Ended Loop 


The standing wave along the slot was measured by using different
sizes of the single-ended loop. Investigation of three loops, 1, 1.5, and
2 inches in diameter (Fig. 4), shows that a 1.5-inch loop is better than the other
two. The nulls of H field of the 1.5-inch loop are located more like the
true H field though the amplitude of H field is 2 dB less than the 2-inch


[Fig. 4 plot: signal level in decibels versus position along the slot (marked in centimeters and in quarter wavelengths from 1λ through 0 to 1λ), with curves for semiloops of 2.0-inch, 1.5-inch, and 1.0-inch diameter.]


Fig. 4. Standing wave of H fields along the slot, using a semiloop as a
receiving probe (one end output).
loop. Comparing the 1.5-inch loop with the 1-inch loop, the amplitude
of the 1.5-inch loop is 3 dB higher and the nulls are still located slightly
better than in the 1-inch loop. Hence, the 1.5-inch loop is chosen even
though it is 1/36 λ off the true H field (the standing wave of the true
H field should be exactly a λ/4 shift from the E field). This was due
to the effect of the electric field. The 1.5-inch loop is approximately
0.1 λ in diameter. This size of the loop was selected and used in the
other types of loop configurations.


4.2 Double-Ended Loop 


The standing wave along the slot was measured by using a semiloop 
as a receiving probe. The two outputs from the 1.5-inch semiloop were 
connected to a hybrid ring where the sum port gave the E field, and
the difference port gave the H field. Since the plane of the loop was in
line with the two transmitting antennas, the H field was a total H field.
The E field and the H field outputs are shown in Fig. 5. The first null
of the H field on the right had a slight disturbance which was
probably due to the imperfect free space.


4.3 Two Orthogonal Loops (unconnected) 


This probe consisted of two semiloops 1.5 inches in diameter. The 
size of the loop was chosen from Fig. 4. The circuit arrangement is 
shown in Fig. 1(b), except the top points of two loops were not con- 
nected. 


4.3.1 45° Orientation 


A double orthogonal semiloop was tested at an orientation of 45° to
the line between the two transmitting antennas. Fig. 6 shows the two
components of H field: $H_1$ and $H_2$. The two H components should be
equal since the two loops were oriented 45° to the axis. However, the
two loops, being roughly hand-made, were not
precisely 45° to the axis. They were also not connected at the top
points. So the fact that $H_2$ from loop 2 was higher than $H_1$ from
loop 1 was not a surprise. Fig. 6 also shows the E fields from the
two loops, and we note that the nulls of the E field from loop 2 were
lower than those from loop 1. The difference between the two loops was that
loop 2 had more cross-section area than loop 1.


4.3.2 90° Orientation 


A double orthogonal semiloop was oriented at 90° to the line
between the two transmitting antennas. In this case, $H_2$ should equal
the H field and $H_1$ should read zero output. From Fig. 7 we see that $H_1$
is 18 dB down compared with $H_2$, but has apparently picked up some E
field since the peaks of $H_1$ are almost located at the nulls of $H_2$ and
vice versa. $H_2$ in Fig. 7 is almost equal to the vector sum of the two
components, $H_1$ and $H_2$, in Fig. 6 (45° case) as one would expect. $E_1$
and $E_2$ in Fig. 7 should be identical. They both represent the E field.
In an ideal situation, $E_1$ and $E_2$ in Fig. 7 and in Fig. 6 should all be
the same. Since the two loops were not connected at the top point, the
maximum output from loop 1 was slightly lower than that from loop 2. Hence,
the nulls of the four E's were not the same.


Fig. 5. Standing waves of E and H fields along the slot, using a semiloop
as a receiving probe.


4.4 Two Orthogonal Loops (connected) — Energy-Density Antenna 


The two orthogonal loops (1.5 inches in dia.) were connected at the 
top point of two loops, shown in Fig. 1(b). 


4.4.1 45° Orientation 


A double orthogonal semiloop was oriented at 45° to the two
transmitting antennas. Fig. 8 shows the two components, $H_1$ and $H_2$. Since
the loops, being roughly hand-made, were not
oriented precisely 45° to the axis and were not actually quite
symmetrical to the center, the two components $H_1$ and $H_2$ were not equal.
There was no remarkable difference between Fig. 8 and Fig. 6. Fig. 8
shows the two E fields: $E_1$ and $E_2$. Their peaks are almost the same,
which might be due to the fact that the two loops were connected at
the top points, but the nulls did not coincide with each other due to
the two unsymmetrical loops. Comparing Fig. 8 and Fig. 6, we found
that we had better results when there was a connection at the top
points of the two loops in that the nulls of $E_1$ were somewhat deeper.


4.4.2 90° Orientation 


A double orthogonal semiloop was oriented at 90° to the two
transmitting antennas. Fig. 9 shows $H_2$ which is the amplitude of the total
H. Loop 1 picked up some E field, as $H_1$ shows, of about the same value
as in the unconnected case. $H_1$ was almost 20 dB down compared with
$H_2$. There was no remarkable difference between Fig. 9 and Fig. 7
except that $H_1$ in Fig. 9 looked more like a pure E-field pickup, although
its field strength was small. Fig. 9 shows that $E_1$ and $E_2$ almost
coincided, but in Fig. 7 they did not. Hence, it is better when the two loops
connect at the top points than when they do not.


Fig. 6. Standing waves of H fields and E fields along the slot, using a
double orthogonal semiloop antenna unconnected at the top point (oriented
at 45°).


V. ENERGY-DENSITY COMPUTATION 


We used the H and E components of two connected orthogonal loops
oriented at 45° (Fig. 8) and 90° (Fig. 9) to compute two sets of
energy density from the measurements made in the free-space
environment. Since both E and H were measured in volts, the energy density
we computed from (1) is


$w = \tfrac{1}{2}(\epsilon E^2 + \mu H^2)$
$\;\; = \tfrac{\epsilon}{2}[E^2(\mathrm{volts}^2/\mathrm{m}^2) + (377\ \mathrm{ohm})^2 \times H^2(\mathrm{amp}^2/\mathrm{m}^2)]$
$\;\; = \tfrac{\epsilon}{2}[E^2(\mathrm{volts/m})^2 + H^2(\mathrm{volts/m})^2]$
$\;\; = \tfrac{\epsilon}{2} w', \qquad (2)$
where
$H^2 = (a_1 H_1)^2 + (a_2 H_2)^2$,
$E^2 = E_1^2$ or $E_2^2$,


a = a weighting factor (a factor relating the level of average peak
values of $H_1$ and $H_2$ components to the E field), and


w' = the energy density in our calculation.


Fig. 7. Standing waves of H fields and E fields along the slot, using a
double orthogonal semiloop antenna unconnected at the top point (oriented
at 90°).

Fig. 8. Standing waves of H fields and E fields along the slot, using a
double orthogonal semiloop antenna connected at the top point, an
energy-density antenna (oriented at 45°).


From Fig. 8 we found that the maximum value of $H_1$ was about
1 dB less than $H_2$. Also from Fig. 9 we found that the maximum value
of either of two E fields was about 2 dB less than $H_2$. Hence, we might
suggest the following equation representing the energy density obtained
from this particular antenna:


$w' = (1.122 H_1)^2 + H_2^2 + (1.26 E_1)^2$
$\;\;\; = (1.26)^2[(0.89 H_1)^2 + (0.795 H_2)^2 + E_1^2], \qquad (3)$


where $a_1 = 0.89$ and $a_2 = 0.795$. From (3) we can calculate two
energy-density curves, one shown in Fig. 10 for the orientation of antenna at
45° and another also shown in Fig. 10 for the orientation of antenna at
90°. From both curves, the maximum-to-minimum range was only
about 2.4 dB, compared to 18-20 dB in Fig. 8 and 9 for the E and H
fields alone.


Fig. 9. Standing waves of H fields and E fields along the slot, using a
double orthogonal semiloop antenna connected at the top point, an
energy-density antenna (oriented at 90°).
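
The numerical factors in (3) follow from converting the quoted 1-dB and 2-dB level differences into amplitude ratios. The sketch below reproduces that arithmetic; the function and variable names are illustrative, and the dB values are simply those read from Figs. 8 and 9 in the text.

```python
def db_to_ratio(db):
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 20.0)

g_h1 = db_to_ratio(1.0)   # H_1 peaks about 1 dB below H_2, so scale H_1 by about 1.122
g_e = db_to_ratio(2.0)    # E peaks about 2 dB below H_2, so scale E by about 1.26
a1 = g_h1 / g_e           # about 0.89
a2 = 1.0 / g_e            # about 0.795

def energy_density_prime(h1, h2, e1):
    """First form of (3)."""
    return (g_h1 * h1) ** 2 + h2 ** 2 + (g_e * e1) ** 2

def energy_density_factored(h1, h2, e1):
    """Second, factored form of (3)."""
    return g_e ** 2 * ((a1 * h1) ** 2 + (a2 * h2) ** 2 + e1 ** 2)

print(g_h1, g_e, a1, a2)
```

The two forms of (3) agree for any set of readings, so the choice between them is only a matter of which field is taken as the reference level.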
Fig. 10. The energy-density calculation of an energy-density antenna (oriented
at 45° and 90°).


VI. CONCLUSION AND COMMENTS 


An energy-density antenna with loops of 1.5 inches in diameter was 
selected from the measurements as the one to test in the mobile radio 
field. The connected orthogonal loops were somewhat better than un- 
connected ones. For two orientations of the loop in the standing wave 
field in the test environment, the computed energy density varied 
much less than any of the field components. The configuration of the 
energy-density antenna could be used at other frequency ranges by 
scaling the diameter of the loops. After an energy-density antenna was
made, a calibration to obtain the weighting factors a; and a2 was 
needed to set up a proper energy-density equation for this particular 
antenna. 

I wish to take this opportunity to thank W. C. Jakes, Jr., for his 
advice and suggestions. 
Error Probability for Binary Signaling
Through a Multipath Channel 


By R. T. AIKEN 


(Manuscript received February 27, 1967) 


Error probability is considered for binary signaling through a multipath
channel in which (i) the receiver observes a waveform comprising white
Gaussian noise and the sum of (perhaps several) time-delayed,
frequency-shifted, Rayleigh-faded versions of the transmitted waveform, (ii) the
receiver decides with minimum error probability which of the two possible
transmissions was sent. Results given herein for the exact minimum error
probability necessarily depend upon a number of parameters and are
cumbersome to use. By introducing bounds on the error probability,
depending upon bounds on spectra of certain matrices, the number of
parameters is reduced and the less cumbersome results become applicable to any
one of a set of channels rather than to just one channel. The
error-probability bounds are presented in terms of values of the distribution function,
derived herein, of the difference of two chi-square random variables. The
bounds are sharp when the spectra are narrow. For the case of widely
orthogonal signals, any version of one possible transmission being orthogonal
to any version of the other transmission, the bounds are given as a set
of universal curves plotted versus signal-to-noise ratio for various values
of the number of paths and of the spectral width of certain matrices. Spectral
bounds can easily be computed when the versions for each transmission
are nearly orthogonal. Returning to the general case, another bound is
derived, by a technique due to Chernoff, which does not explicitly require
spectral bounds which may neither be readily available nor be accurate
approximations of eigenvalues. This bound is not as sharp as the previous
bound for the case of small spectral width, but has promise for the
large-width case.


I. INTRODUCTION 


This paper considers error probability for the optimum reception 
of binary signals transmitted through a multipath channel having 
P paths.* One of two possible signals is transmitted; the received
waveform is the sum of P Rayleigh-faded, time-delayed,
frequency-shifted versions of the transmitted signal, plus white Gaussian noise.
That is to say, if the complex signal $\sqrt{2E_m}\, x_m(t)$ is transmitted, m = 1, 2,
the contribution to the received waveform from the pth path is


$y_m(t; p) = \sqrt{2E_m}\, a_p x_m(t - \tau_p) \exp[i(2\pi f_p t + \phi_p)],$
where $a_p$, $\phi_p$, $\tau_p$, and $f_p$ are the Rayleigh-distributed amplitude, the
uniformly-distributed phase, the fixed time delay, and the fixed
frequency shift associated with the pth path. The received waveform is


$z_m(t) = \sum_{p=1}^{P} y_m(t; p) + n(t),$


where n(t) is white Gaussian noise.

The above multipath situation is a special case of a more general
communications situation in which a receiver observes a sample z(t)
of a zero-mean complex Gaussian process on the time interval [0, T],
the covariance function $\langle z(s) z^*(t) \rangle_m$ having been selected from a set
of two distinct functions by chance according to the prior
probabilities $\{\alpha_m\}$, m = 1, 2, and the other second-moment function $\langle z(s) z(t) \rangle_m$
being zero. The receiver is to be designed so that its decision
upon one of the two possible hypotheses is made with
minimum average error probability $P_e$, where $P_e = \sum_m \alpha_m P_e(m)$ and
$P_e(m)$ is the probability, when covariance indexed m is true, of
deciding otherwise.

The receiver-design problem has been treated in Ref. 1, rigorously 
demonstrating that optimum processing involves quadratic filtering. 
However, the filter kernels, being the solutions of integral equations, 
are difficult to determine in general; moreover, the error probability 
is not evaluated. For the multipath channel, the first difficulty is over- 
come in Ref. 2 and the evaluation of binary error probability is 
considered in the present paper. 

Section II presents the theory of a method that can be used to 
calculate error probability exactly. However, it is quickly appreciated 
that error probability depends in a cumbersome fashion upon a 
large number of parameters including the path strengths and the 
scalar products of the versions. To simplify this situation, this paper 
introduces bounds on the error probability which depend upon bounds 
on the spectra of certain matrices, the eigenvalues of which determine 


* Each path could comprise a multitude of randomly phased subpaths having
essentially the same delay and frequency-shift parameters. 




error probability exactly. Thus, the bounds are applicable to any one 
of a set of channels rather than to just one channel. 

Section III presents these error-probability bounds in terms of 
values of the distribution function of the difference of two chi-square 
random variables and then derives this distribution function. More 
specific results are obtained in Section IV for the case of widely 
orthogonal signals, any path’s version of one of the two possible trans- 
mitted waveforms being orthogonal to any path’s version of the other 
waveform. Here, easily computed spectral bounds can be given for 
the case in which the versions under each hypothesis are nearly 
orthogonal. Section V considers the case of well-resolved paths, making 
contact with diversity theory (Ref. 3, Chap. 7), and the case of on-off 
keying. 

The error-probability bounds considered above require spectral 
bounds which may not always be easily computed and which may not 
be accurate approximations of the eigenvalues. A bound that circum- 
vents these difficulties is obtained in Section VI with a technique due 
to Chernoff. Comparison of this bound with previous bounds is car- 
ried out analytically only for the case of well-resolved paths, but 
qualitative comparison is made for more general cases. 


II. PROCEDURE TO OBTAIN ERROR PROBABILITY IN THE GENERAL CASE 


2.1 Notation 


The binary situation is a specialization of the case of M-ary
signaling through the multipath channel in which the received process
z(t) can have one of M possible covariance functions, $\langle z(s) z^*(t) \rangle_m$,
m = 1, 2, ..., M, of the form


$2E_m \sum_{p=1}^{P} \sigma_p b_p(s, m) b_p^*(t, m) + N_0 \delta(s - t),$


a degenerate kernel plus a white-noise kernel (Ref. 2). Here $b_p(t, m) =
\exp(i 2\pi f_p t)\, x_m(t - \tau_p)$ is a time-doppler-shifted normalized version of
the transmitted signal $\sqrt{2E_m}\, x_m(t)$; the path with index p has an
average cross section of $\sigma_p$ units, a delay of $\tau_p$ seconds, and a
doppler-shift of $f_p$ Hz. We put


$\int dt\, |x_m(t)|^2 = \int dt\, |b_p(t, m)|^2 = 1,$


so that the average energy received from the medium is
$\tfrac{1}{2} \int dt\, 2E_m \sum_p \sigma_p |b_p(t, m)|^2 = E_m \sum_p \sigma_p = E_m,$


since we put $\sum_p \sigma_p = 1$.
The above covariance function can be written
$2E_m b(s, m)\, \sigma\, b^*(t, m) + N_0 \delta(s - t),$


where b(t, m) is a vector with pth component $b_p(t, m)$ and $\sigma$ is the diagonal
matrix with pth entry $\sigma_p$, with $\mathrm{tr}\, \sigma = 1$.

The optimum receiver decides according to the value of m that
corresponds to the largest of M test statistics computed as follows.
For each value of m, the receiver first generates the column vector
$Z(m) = N_0^{-1/2} \int dt\, z(t) b^*(t, m)$ and then evaluates a test statistic
comprising a Hermitian form in Z(m) plus a bias constant. This test statistic
is


$[(N_0/2E_1)^{1/2} Z(m)]^\dagger (2E_m/N_0) H(m) [(N_0/2E_1)^{1/2} Z(m)] + (N_0/2E_1)\theta(m),$
where the Hermitian combining matrix is
$(2E_m/N_0) H(m) = (2E_m/N_0) [(2E_m/N_0) B(m) + \sigma^{-1}]^{-1},$
the bias is given by


$\theta(m) = \log \frac{\alpha_m \det[(2E_1/N_0) B(1) + \sigma^{-1}]}{\alpha_1 \det[(2E_m/N_0) B(m) + \sigma^{-1}]},$


B(m) is the correlation-function matrix $\int dt\, b^*(t, m) b(t, m)$, and the
hypotheses are ordered so that $E_1 = \max E_m$. The above test statistic
is obtained from that given in Ref. 2 by subtracting $\log[\alpha_1 \det^{-1} \sigma
\det^{-1} H^{-1}(1)]$ and multiplying all resulting terms by $N_0/2E_1$.

The above test statistic has a certain intuitive appeal. The components
of the vector Z(m) are the correlations of the received signal against
the noise-free versions of the transmitted signal that would occur
when message m is sent. That is to say, Z(m) provides a measure
of the projection of z(t) on the P-dimensional subspace spanned by
these versions. Moreover, the test statistic is a measure of the likelihood
that this P-dimensional subspace is in fact the correct subspace. Then
the optimum receiver strategy is decision according to the most likely
of the M possible subspaces. Also, since P dimensions are involved,
it might be anticipated that the results are related to the case of P-fold
diversity, cf. Section 5.1.

Henceforth, only the binary case, M = 2, is considered. In this
case, decision according to the larger of two test statistics is equivalent
to decision according to the sign of their difference. The decision events
can then be written in terms of one Hermitian form in a composite
Gaussian vector


$Z = (N_0/2E_1)^{1/2} \begin{pmatrix} Z(1) \\ Z(2) \end{pmatrix}$


as follows. Let


$Q = \begin{pmatrix} (2E_1/N_0) H(1) & 0_{P \times P} \\ 0_{P \times P} & -(2E_2/N_0) H(2) \end{pmatrix},$


where $0_{P \times P}$ is the P × P zero matrix. Then the receiver decides upon
m = 2 when $Z^\dagger Q Z$ is less than $(N_0/2E_1)\theta(2)$, and decides upon m = 1
otherwise.

The conditional error probabilities are thus


$P_e(1) = \Pr\{Z^\dagger Q Z < (N_0/2E)\theta \mid 1\} = F_1[(N_0/2E)\theta],$


$P_e(2) = \Pr\{Z^\dagger Q Z > (N_0/2E)\theta \mid 2\} = 1 - F_2[(N_0/2E)\theta],$


where $E = E_1$, $\theta = \theta(2)$, and $F_m(x)$ is the distribution function of
$Z^\dagger Q Z$ conditioned upon the mth hypothesis.


2.2 The Fundamental Matrices 


Since $Z^\dagger Q Z$ is a function of a Gaussian vector, the distribution
function $F_m(x)$ is determined by the conditional mean, $\langle Z \rangle_m$, which is
the zero vector, and by the conditional covariance $L(m) = \langle Z Z^* \rangle_m$,
the other second-moment matrix $\langle Z Z \rangle_m$ being the 2P × 2P zero matrix.

The conditional covariance matrix L(m) is evaluated as follows. Let


$L(m) = \begin{pmatrix} L^{11}(m) & L^{12}(m) \\ L^{21}(m) & L^{22}(m) \end{pmatrix},$


where $L^{jk}(m) = (N_0/2E_1) \langle Z(j) Z^*(k) \rangle_m$. Then, by the definition of Z(j)
and interchange of operations, we obtain


$\langle Z(j) Z^*(k) \rangle_m = N_0^{-1} \iint ds\, dt\, b^*(s, j) \langle z(s) z^*(t) \rangle_m b(t, k)$
$\;\; = N_0^{-1} \iint ds\, dt\, b^*(s, j)\, 2E_m b(s, m)\, \sigma\, b^*(t, m) b(t, k) + \iint ds\, dt\, b^*(s, j) \delta(s - t) b(t, k)$
$\;\; = (2E_m/N_0) B(j, m)\, \sigma\, B(m, k) + B(j, k),$
where $B(j, m) = \int ds\, b^*(s, j) b(s, m)$ is a cross-correlation matrix.
Hence,


$L^{jk}(m) = (E_m/E_1) B(j, m)\, \sigma\, B(m, k) + (N_0/2E_1) B(j, k).$


Similarly, it is found that $\langle Z Z \rangle_m$ is the 2P × 2P zero matrix.
For future computations, it is convenient to write


$Q = \begin{pmatrix} Q^{11} & 0 \\ 0 & Q^{22} \end{pmatrix},$


$Q^{11} = B^{-1}(1)\{I + (N_0/2E_1)[B(1)\sigma]^{-1}\}^{-1},$
$Q^{22} = -(E_2/E_1) B^{-1}(2)\{(E_2/E_1) I + (N_0/2E_1)[B(2)\sigma]^{-1}\}^{-1}.$
2.3 The Characteristic-Function Method


To obtain the distribution, consider the conditional characteristic
function


$\varphi_m(t) = \langle \exp(i t Z^\dagger Q Z) \rangle_m.$
It is well known, e.g., Ref. 4, that


$\varphi_m(t) = \det{}^{-1}[I - i t L(m) Q] = \prod_k [1 - i t \lambda_k(m)]^{-1},$


where $\{\lambda_k(m)\}$ is the set of eigenvalues of the matrix L(m)Q. The
eigenvalues are real, since L(m)Q is similar to the Hermitian matrix
$L^{1/2}(m) Q L^{1/2}(m)$.

The distribution function can now be obtained from the characteristic
function. As a preliminary, it is noted that the characteristic function
$(1 - i t \lambda)^{-n}$ corresponds to one of two distribution functions, according
to the sign of $\lambda$. When $\lambda$ is positive, the distribution function is


$\int_{-\infty}^{y} dx\, \frac{U(x)\, x^{n-1} \exp(-x/\lambda)}{\lambda^n (n - 1)!} = \begin{cases} I(y/\lambda, n - 1) & (y > 0), \\ 0 & (y < 0), \end{cases}$
$\;\; = U(y) I(y/\lambda, n - 1),$
where U(x) is the unit step function (unity for x > 0, zero for x < 0,
one-half for x = 0) and where
$I(y, n) = \frac{1}{n!} \int_0^y dx\, x^n e^{-x} = 1 - e^{-y} \sum_{k=0}^{n} \frac{y^k}{k!}$
is the incomplete gamma function. Similarly, when $\lambda$ is negative, the
distribution function is


$\begin{cases} 1 & (y > 0), \\ 1 - I(y/\lambda, n - 1) & (y < 0), \end{cases}$

$\;\; = 1 - U(-y) I(y/\lambda, n - 1).$


To obtain the distribution function of $Z^\dagger Q Z$, the characteristic
function is expanded into its partial fractions. Each term will be
proportional to $(1 - i t \lambda)^{-n}$ for some n, and corresponds to a term in the
expansion of the distribution function. For example, when all eigenvalues
are distinct, the expansion of the characteristic function is


$\varphi_m(t) = \sum_k \frac{d_k(m)}{1 - i t \lambda_k(m)},$
where
$d_k(m) = \prod_{l \neq k} \left[1 - \frac{\lambda_l(m)}{\lambda_k(m)}\right]^{-1}.$
The expansion of the distribution function $F_m(y)$ is then


$\sum_{\{k:\, \lambda_k(m) > 0\}} d_k(m) U(y) I(y/\lambda_k(m), 0) + \sum_{\{k:\, \lambda_k(m) < 0\}} d_k(m)[1 - U(-y) I(y/\lambda_k(m), 0)].$


In the case of a degenerate spectrum, an eigenvalue $\lambda$ with multiplicity
r contributes the sum $\sum_{n=1}^{r} A_n (1 - i t \lambda)^{-n}$ to the expansion of the
characteristic function, and the corresponding part of the distribution
function involves $I(\cdot, n)$ for n = 0, 1, 2, ..., r − 1.

It should be observed that the general approach of summing
distribution functions corresponding to partial fractions is fully equivalent to
inverting the characteristic function by contour integration, the
approach used by Turin⁵ for a similar problem. (When all poles are simple,
the expansion coefficients $\{d_k(m)\}$ are residues of the poles.)
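
The distinct-eigenvalue expansion above lends itself to direct computation. The following is a minimal sketch, assuming all eigenvalues are real, nonzero, and distinct, and using the fact that I(y, 0) = 1 − exp(−y); the function names and the sample eigenvalues are illustrative only.

```python
import math

def expansion_coefficients(eigs):
    """d_k = prod over l != k of (1 - lambda_l / lambda_k)^(-1), distinct eigenvalues."""
    coeffs = []
    for k, lam_k in enumerate(eigs):
        d = 1.0
        for l, lam_l in enumerate(eigs):
            if l != k:
                d /= (1.0 - lam_l / lam_k)
        coeffs.append(d)
    return coeffs

def hermitian_form_cdf(y, eigs):
    """Distribution function of sum_k lambda_k |z_k|^2 via the partial-fraction expansion."""
    total = 0.0
    for d, lam in zip(expansion_coefficients(eigs), eigs):
        if lam > 0:
            # Positive eigenvalue: d * U(y) * I(y/lambda, 0)
            total += d * (1.0 - math.exp(-y / lam)) if y > 0 else 0.0
        else:
            # Negative eigenvalue: d * [1 - U(-y) * I(y/lambda, 0)]
            total += d if y >= 0 else d * math.exp(-y / lam)
    return total

eigs = [2.0, 0.5, -1.0, -0.3]   # illustrative spectrum, two positive and two negative
for y in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(y, hermitian_form_cdf(y, eigs))
```

For a two-eigenvalue spectrum the value at y = 0 reduces to |λ₂|/(λ₁ + |λ₂|), which gives a quick hand check on the coefficients.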
III. UPPER AND LOWER BOUNDS ON THE ERROR PROBABILITY


3.1 Error-probability Bounds from Degenerate-spectrum Variables 


Exact computation of error probability involves considerable nu- 
merical work in computing eigenvalues followed by evaluation of cum- 
bersome formulas. Moreover, an often inordinately large number of 
independent parameters must be specified. To simplify this situation, 
we consider bounds on the spectrum of L(m)Q rather than the spectrum 
itself. With a technique suggested in Ref. 6, we can obtain error- 
probability bounds. Although we do not obtain the error probability 
itself, the error-probability bounds apply to not just one channel but 
rather to any channel for which the spectral bounds are met. 

Observe that the characteristic function is precisely specified by
the spectrum of L(m)Q. This spectrum is the same as the spectrum
of $I\, \mathrm{diag}[\lambda_1(m), \ldots, \lambda_{2P}(m)]$, where I plays the role of a covariance
matrix and the diagonal matrix plays the role of a matrix of a Hermitian
form. Hence, the distribution of $Z^\dagger Q Z$ is the same as the distribution of


$q(m) = \sum_{k=1}^{2P} \lambda_k(m) |z_k|^2,$


where $\{z_k\}$ are complex zero-mean Gaussian variates with covariance
matrix $\langle z_j z_k^* \rangle = \delta_{jk}$, $\langle z_j z_k \rangle$ being zero.

Suppose bounds on the eigenvalues are available. That is to say,
suppose it is known that the positive eigenvalues satisfy

\[ \underline{\mu} \le \lambda_k(m) \le \bar{\mu}, \tag{1a} \]

and that the negative eigenvalues satisfy

\[ -\bar{\nu} \le \lambda_k(m) \le -\underline{\nu}, \tag{1b} \]

where the $\mu$'s and $\nu$'s are positive numbers that depend on m. Then,
a lower bound on q(m) is the degenerate-spectrum random variable
$\underline{q}(m)$, defined by

\[ \underline{q}(m) = \underline{\mu} \sum_{k=1}^{P} |z_k|^2 - \bar{\nu} \sum_{k=P+1}^{2P} |z_k|^2. \]

Note that we have used the fact that the number of positive eigenvalues
and the number of negative eigenvalues are the same; see Appendix A.
Similarly, an upper bound on q(m) is provided by the random variable

\[ \bar{q}(m) = \bar{\mu} \sum_{k=1}^{P} |z_k|^2 - \underline{\nu} \sum_{k=P+1}^{2P} |z_k|^2. \]

Since $\underline{q}(m) \le q(m) \le \bar{q}(m)$, it follows that

\[ \Pr\{\bar{q}(m) \le y\} \le F_m(y) = \Pr\{q(m) \le y\} \le \Pr\{\underline{q}(m) \le y\}. \]


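Evaluation of these bounds requires the distribution function
$G(y; P, \alpha)$ of the degenerate-spectrum random variable

\[ \sum_{k=1}^{P} |z_k|^2 - \alpha \sum_{k=P+1}^{2P} |z_k|^2, \]

which is the difference of two chi-square variables each with an even
number of degrees of freedom. The bounds become

\[ G[(\bar{\mu})^{-1} y;\, P,\, \underline{\nu}(\bar{\mu})^{-1}] \le F_m(y) \le G[(\underline{\mu})^{-1} y;\, P,\, \bar{\nu}(\underline{\mu})^{-1}], \]

where we use $y = (N_0/2E)\theta$ and reiterate that the $\mu$'s and $\nu$'s depend
on m.
It is anticipated that these bounds are sharp when the spectrum
is narrow, the spread of the positive spectrum being much less than
any positive eigenvalue and similarly for the negative spectrum. Also,
when $\theta$ itself is not precisely known, but bounds $\underline{\theta} \le \theta \le \bar{\theta}$ are available,
the distribution function is bounded by

\[ G[(\bar{\mu})^{-1} \underline{y};\, P,\, \underline{\nu}(\bar{\mu})^{-1}] \le F_m(y) \le G[(\underline{\mu})^{-1} \bar{y};\, P,\, \bar{\nu}(\underline{\mu})^{-1}], \tag{2} \]

where $\underline{y} = (N_0/2E)\underline{\theta}$ and $\bar{y} = (N_0/2E)\bar{\theta}$.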
3.2 Distribution of a Degenerate-Spectrum Variable 
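It will be demonstrated that $G(y; P, \alpha)$ equals

\[ \left( \frac{\alpha}{1+\alpha} \right)^{P} \sum_{k=0}^{P-1} \binom{P-1+k}{k} \left( \frac{1}{1+\alpha} \right)^{k} \left[ 1 - I\!\left( -\frac{y}{\alpha},\, P-1-k \right) \right] \tag{3a} \]

when y < 0, and equals

\[ \sum_{k=0}^{P-1} \binom{P-1+k}{k} \left[ \left( \frac{\alpha}{1+\alpha} \right)^{P} \left( \frac{1}{1+\alpha} \right)^{k} + \left( \frac{1}{1+\alpha} \right)^{P} \left( \frac{\alpha}{1+\alpha} \right)^{k} I(y,\, P-1-k) \right] \tag{3b} \]

when y > 0.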

Before doing so, note that when y < 0, the parameter $\alpha$ serves as
a scale size for y in the argument of $I(x, n)$, but that this is not true
when y > 0. Nevertheless, $\alpha$ does act as a scale size in the following
way. A power-series expansion of $I(x, n)$ yields

\[ \left( \frac{1}{1+\alpha} \right)^{P} \left( \frac{\alpha}{1+\alpha} \right)^{k} I(y,\, P-1-k)
   = \left( \frac{\alpha}{1+\alpha} \right)^{P} \left( \frac{1}{1+\alpha} \right)^{k} \frac{(y/\alpha)^{P-k}}{(P-k)!} \left[ 1 + O(y) \right], \]

and when $\alpha < 1$, the factor $(y/\alpha)^{P-k}$ determines the small-y behavior.
Also, this result exhibits $[\alpha/(1+\alpha)]^{P}$ as a factor for the case y > 0,
in agreement with the expression for y < 0.

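To find $G(y; P, \alpha)$, we consider its characteristic function

\[ (1 - it)^{-P} (1 + it\alpha)^{-P}. \]

Let the partial-fraction expansion of this characteristic function be

\[ \sum_{m=0}^{P-1} A_{P-m} (1 - it)^{-(P-m)} + \sum_{n=0}^{P-1} B_{P-n} (1 + it\alpha)^{-(P-n)}. \]

To evaluate $A_{P-m}$, multiply by $(1 - it)^{P}$ and let $1 - it = \tau$ to obtain

\[ (1 + \alpha - \alpha\tau)^{-P} = \sum_{m=0}^{P-1} A_{P-m} \tau^{m} + \tau^{P} \sum_{n=0}^{P-1} B_{P-n} (1 + \alpha - \alpha\tau)^{-(P-n)}. \]

Since the second sum is analytic at $\tau = 0$, we have exhibited the Taylor
expansion with remainder. But

\[ (1 + \alpha - \alpha\tau)^{-P} = (1+\alpha)^{-P} \left( 1 - \frac{\alpha\tau}{1+\alpha} \right)^{-P}
   = \left( \frac{1}{1+\alpha} \right)^{P} \sum_{k=0}^{\infty} \binom{P-1+k}{k} \left( \frac{\alpha\tau}{1+\alpha} \right)^{k}, \]

where we have used (7) on page 2 of Ref. 7. Hence,

\[ A_{P-m} = \left( \frac{1}{1+\alpha} \right)^{P} \binom{P-1+m}{m} \left( \frac{\alpha}{1+\alpha} \right)^{m}. \]

Similarly, to obtain $B_{P-n}$, multiply by $(1 + it\alpha)^{P}$ and let $1 + it\alpha = \tau$
to obtain

\[ \left( 1 + \frac{1}{\alpha} - \frac{\tau}{\alpha} \right)^{-P} = \sum_{n=0}^{P-1} B_{P-n} \tau^{n} + \tau^{P} \sum_{m=0}^{P-1} A_{P-m} \left( 1 + \frac{1}{\alpha} - \frac{\tau}{\alpha} \right)^{-(P-m)}. \]

Reasoning as before, it is seen that

\[ B_{P-n} = \left( \frac{\alpha}{1+\alpha} \right)^{P} \binom{P-1+n}{n} \left( \frac{1}{1+\alpha} \right)^{n}. \]

Collecting these results, it is seen that the characteristic function is

\[ \left( \frac{1}{1+\alpha} \right)^{P} \sum_{k=0}^{P-1} \binom{P-1+k}{k} \left( \frac{\alpha}{1+\alpha} \right)^{k} (1 - it)^{-(P-k)}
 + \left( \frac{\alpha}{1+\alpha} \right)^{P} \sum_{k=0}^{P-1} \binom{P-1+k}{k} \left( \frac{1}{1+\alpha} \right)^{k} (1 + it\alpha)^{-(P-k)}. \]
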

This immediately establishes the distribution function G(y; P, a). 
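
As a numerical companion to this derivation, the distribution function of the degenerate-spectrum variable can be evaluated directly from the partial-fraction coefficients. The sketch below is an illustration under stated assumptions, not the authors' program: it takes $I(x, n) = (1/n!)\int_0^x t^n e^{-t}\,dt$ for integer $n$, works with the tail $1 - I(x, n)$ for numerical convenience, and uses the hypothetical names `gamma_tail` and `G`.

```python
import math

def gamma_tail(x, n):
    """1 - I(x, n) = exp(-x) * sum_{j=0}^{n} x^j / j!  (taken as 1 for x <= 0)."""
    if x <= 0.0:
        return 1.0
    term, total = 1.0, 1.0
    for j in range(1, n + 1):
        term *= x / j
        total += term
    return math.exp(-x) * total

def G(y, P, alpha):
    """Pr{ sum_{k<=P} |z_k|^2 - alpha * sum_{k>P} |z_k|^2 <= y }, alpha >= 0."""
    if alpha == 0.0:
        return 1.0 - gamma_tail(y, P - 1)          # G(y; P, 0) = I(y, P - 1)
    r = alpha / (1.0 + alpha)
    s = 1.0 / (1.0 + alpha)
    if y < 0.0:
        # only the terms in (1 + i t alpha)^-(P-k) contribute
        return r ** P * sum(math.comb(P - 1 + k, k) * s ** k
                            * gamma_tail(-y / alpha, P - 1 - k) for k in range(P))
    # for y >= 0 use 1 minus the upper tail of the (1 - i t)^-(P-k) terms
    return 1.0 - s ** P * sum(math.comb(P - 1 + k, k) * r ** k
                              * gamma_tail(y, P - 1 - k) for k in range(P))
```

The two branches agree at y = 0 because the coefficients $A_{P-m}$ and $B_{P-n}$ sum to unity (the characteristic function equals 1 at t = 0); for $\alpha = 1$ the symmetry of the variable gives G(0; P, 1) = 1/2.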
IV. WIDELY ORTHOGONAL SIGNALS


4.1 Matrices for the Two Hypotheses 


We consider the special case in which the signals are widely orthogonal,
$B(1, 2) = B(2, 1) = 0_{P \times P}$. That is to say, all time-doppler shifted
versions of one signal are orthogonal to all such versions of the other
signal, a situation that would prevail in frequency-shift keying with
widely separated frequencies. In this case,

\[ L^{jk}(m) = \delta_{jk} \left\{ \frac{E_m}{E_1} B(j, m)\, \sigma\, B(m, k) + \frac{N_0}{2E_1} B(j, k) \right\}. \]

The 'diagonal' form of the covariance matrix L(m) and of the matrix Q
implies that the spectrum of L(m)Q comprises the spectrum of $L^{11}(m)Q^{11}$
together with the spectrum of $L^{22}(m)Q^{22}$. This can be seen by employing
the formulas of Schur (Ref. 8, pp. 45-46) to reduce the determinantal
equation $\det[L(m)Q - \lambda I] = 0$ from order 2P to order P. For m = 1,

\[ L^{11}(1)Q^{11} = B(1)\sigma, \]
\[ L^{22}(1)Q^{22} = -(N_0/2E_1)(E_2/E_1) \left\{ (E_2/E_1) I + (N_0/2E_1)[B(2)\sigma]^{-1} \right\}^{-1}. \]

For m = 2,

\[ L^{11}(2)Q^{11} = (N_0/2E_1) \left\{ I + (N_0/2E_1)[B(1)\sigma]^{-1} \right\}^{-1}, \]
\[ L^{22}(2)Q^{22} = -(E_2/E_1) B(2)\sigma. \]


It should be observed that the spectra of the above matrices are
simply related to the spectra of $B(1)\sigma$ and of $B(2)\sigma$. When $E_1 = E_2 = E$,
the spectrum of $L^{22}(1)Q^{22}$ is $\{-(N_0/2E)(1 + (N_0/2E)\delta_k^{-1})^{-1}\}$, where
$\{\delta_k\}$ is the spectrum of $B(2)\sigma$. Similarly, the spectrum of $L^{11}(2)Q^{11}$
is $\{(N_0/2E)(1 + (N_0/2E)\omega_k^{-1})^{-1}\}$, where $\{\omega_k\}$ is the spectrum of $B(1)\sigma$.

Second, it should be observed that when $E_1 = E_2 = E$, the forms
of the matrices for the cases m = 1 and m = 2 are the same, with the
roles of positive and negative matrices interchanged. To compute
error probability for m = 1, we use the distribution function F(x);
for m = 2, we use the conjugate distribution 1 - F(x), which can be
expressed as $\Pr\{-Z'QZ < -x \mid 2\}$, the distribution function of the
negative of the original variable evaluated at -x. Introduction of this
random variable for the case m = 2 reverses the roles of positive and
negative matrices, the net effect being that for both m = 1 and m = 2
the positive and negative matrices have the same forms.
4.2 Bounds on Spectra, $\theta$, and Error Probability


It is clear that spectral bounds on L(1)Q can be obtained from spectral
bounds on $B(m)\sigma$, m = 1, 2, and similarly for L(2)Q. Consider the
bounds on L(1)Q when $E_1 = E_2 = E$. The positive spectrum is bounded
as follows:

\[ \underline{\mu} = \underline{\omega} \le \min_k \omega_k \le \lambda_k(1) \le \max_k \omega_k \le \bar{\omega} = \bar{\mu}, \]

and the negative spectrum is bounded as follows:

\[ -\bar{\nu} = -(N_0/2E)[1 + (N_0/2E)(\bar{\delta})^{-1}]^{-1} \le \lambda_k(1), \]
\[ \lambda_k(1) \le -(N_0/2E)[1 + (N_0/2E)(\underline{\delta})^{-1}]^{-1} = -\underline{\nu}, \]

where $\underline{\delta} \le \min_k \delta_k \le \max_k \delta_k \le \bar{\delta}$.


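Moreover, bounds on $\theta$ can also be obtained. When $E_1 = E_2 = E$
and $\alpha_1 = \alpha_2 = \tfrac{1}{2}$ (equilikely signals),

\[ y = (N_0/2E)\theta = (N_0/2E) \log \frac{\det[B(1)\sigma + (N_0/2E)I]}{\det[B(2)\sigma + (N_0/2E)I]}. \]

Since a determinant is the product of the eigenvalues of the matrix,
we have

\[ (N_0/2E)\theta = (N_0/2E) \log \prod_{k=1}^{P} \frac{\omega_k + (N_0/2E)}{\delta_k + (N_0/2E)}. \]

Thus, an upper bound is

\[ (N_0/2E)\bar{\theta} = (N_0/2E) P \log \frac{\bar{\omega} + (N_0/2E)}{\underline{\delta} + (N_0/2E)}, \]

and a lower bound is

\[ (N_0/2E)\underline{\theta} = (N_0/2E) P \log \frac{\underline{\omega} + (N_0/2E)}{\bar{\delta} + (N_0/2E)}. \]

Recall that the distribution function $F_1[(N_0/2E)\theta]$ is bounded from
above by $G[(\underline{\mu})^{-1}(N_0/2E)\bar{\theta};\, P,\, \bar{\nu}(\underline{\mu})^{-1}]$. Further, suppose that the
spectra of $B(1)\sigma$ and $B(2)\sigma$ are narrow about the nominal value
$(1/P)\,\mathrm{tr}\, B(m)\sigma = (1/P)\,\mathrm{tr}\, \sigma = (1/P)$. We can put

\[ \underline{\omega} = \underline{\delta} = \frac{1 - \beta}{P}, \qquad \bar{\omega} = \bar{\delta} = \frac{1 + \beta}{P}, \tag{4} \]

where $\beta$ is the fractional spectral half width. Then, the parameters
required to compute the upper bound on the distribution function are

\[ (\underline{\mu})^{-1}(N_0/2E)\bar{\theta} = \frac{N_0 P/2E}{1 - \beta}\, P \log \frac{1 + \beta + (N_0 P/2E)}{1 - \beta + (N_0 P/2E)}, \tag{5a} \]

\[ \bar{\nu}(\underline{\mu})^{-1} = \frac{N_0 P/2E}{1 - \beta} \left[ 1 + \frac{N_0 P/2E}{1 + \beta} \right]^{-1}. \tag{5b} \]

Similarly, the distribution function $F_1[(N_0/2E)\theta]$ is bounded from below
by $G[(\bar{\mu})^{-1}(N_0/2E)\underline{\theta};\, P,\, \underline{\nu}(\bar{\mu})^{-1}]$. The parameters required for this bound
are

\[ (\bar{\mu})^{-1}(N_0/2E)\underline{\theta} = \frac{N_0 P/2E}{1 + \beta}\, P \log \frac{1 - \beta + (N_0 P/2E)}{1 + \beta + (N_0 P/2E)}, \tag{5c} \]

\[ \underline{\nu}(\bar{\mu})^{-1} = \frac{N_0 P/2E}{1 + \beta} \left[ 1 + \frac{N_0 P/2E}{1 - \beta} \right]^{-1}. \tag{5d} \]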
Having considered the case m = 1, the bounds for the case m = 2
are apparent. Considering the random variable -Z'QZ with $\theta$ assumed
known, the positive and negative spectral bounds are precisely the
same as for the case m = 1, and the upper bound is

\[ G[-(\underline{\mu})^{-1}(N_0/2E)\theta;\, P,\, \bar{\nu}(\underline{\mu})^{-1}], \]

whereas the lower bound is $G[-(\bar{\mu})^{-1}(N_0/2E)\theta;\, P,\, \underline{\nu}(\bar{\mu})^{-1}]$. But $\theta$ is
unknown, and the upper bound is given by replacing $-\theta$ by $\bar{\theta}$, and the
same result is obtained as previously; similarly, the lower bound is
given by replacing $-\theta$ by $\underline{\theta}$. In short, the bounds apply to both cases,
m = 1 and 2.
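
To make the recipe concrete, the sketch below evaluates the upper and lower bounds for the narrow-spectrum, equal-energy, widely orthogonal case. It is a hedged illustration rather than the program behind Figs. 1 to 3: it assumes spectral bounds $(1 \pm \beta)/P$ for both $B(1)\sigma$ and $B(2)\sigma$, writes the per-path signal-to-noise ratio as `snr_per_path` = $2E/N_0P$, and reuses the partial-fraction form of $G(y; P, \alpha)$. All function names are hypothetical.

```python
import math

def gamma_tail(x, n):
    """1 - I(x, n) = exp(-x) * sum_{j=0}^{n} x^j / j!  (taken as 1 for x <= 0)."""
    if x <= 0.0:
        return 1.0
    term, total = 1.0, 1.0
    for j in range(1, n + 1):
        term *= x / j
        total += term
    return math.exp(-x) * total

def G(y, P, alpha):
    """Distribution function of sum_{k<=P}|z_k|^2 - alpha*sum_{k>P}|z_k|^2, alpha > 0."""
    r, s = alpha / (1.0 + alpha), 1.0 / (1.0 + alpha)
    if y < 0.0:
        return r ** P * sum(math.comb(P - 1 + k, k) * s ** k
                            * gamma_tail(-y / alpha, P - 1 - k) for k in range(P))
    return 1.0 - s ** P * sum(math.comb(P - 1 + k, k) * r ** k
                              * gamma_tail(y, P - 1 - k) for k in range(P))

def error_probability_bounds(snr_per_path, P, beta):
    """Return (lower, upper) bounds on the error probability, 0 <= beta < 1."""
    c = 1.0 / snr_per_path                       # N0*P/2E
    log_ratio = math.log((1.0 + beta + c) / (1.0 - beta + c))
    y_up = c / (1.0 - beta) * P * log_ratio              # first argument, upper bound
    a_up = c / (1.0 - beta) / (1.0 + c / (1.0 + beta))   # alpha, upper bound
    y_lo = -c / (1.0 + beta) * P * log_ratio             # first argument, lower bound
    a_lo = c / (1.0 + beta) / (1.0 + c / (1.0 - beta))   # alpha, lower bound
    return G(y_lo, P, a_lo), G(y_up, P, a_up)
```

With $\beta = 0$ the two bounds coincide; for P = 1 they reduce to $1/(2 + 2E/N_0)$, which is a convenient sanity check on the parameter formulas.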

The numerical values of these bounds are given in Figs. 1 to 3 as
functions of $2E/N_0P$ (the signal-to-noise ratio per path) for various
fixed values of $\beta$ (the fractional spectral half-width) and P (the number
of paths). The curves are nested with respect to values of the fractional
spectral half-width $\beta$; an increase of $\beta$ always yields an increase of the
upper bound and a decrease of the lower bound. A measure of the sharp-
ness of the bounds (given a nominal value of error probability $P_e$)
is provided by the difference of the upper-bound and lower-bound
values of $2E/N_0P$ (in dB) for given values of $\beta$ and P. For $P_e$ = 10*
and P = 4, the sharpness is 11 dB for $\beta$ = 0.05 and 22 dB for $\beta$ =
This measure of sharpness appears to be relatively insensitive to the
value of P. An alternate measure would be the difference in error
probability for a given value of $2E/N_0P$, and this measure is indeed
markedly sensitive to P.

In the region of the curves corresponding to high signal-to-noise
ratio, there is an improvement in error probability associated with
larger P; the curves become straight lines since $P_e$ becomes proportional
to $(2E/N_0P)^{-P}$. However, this improvement is in part attributable
to choosing $2E/N_0P$, the average per-path signal-to-noise ratio, as
the abscissa rather than $2E/N_0$, the total signal-to-noise ratio. To
obtain plots vs $2E/N_0$, one moves the $P = 2^n$ curves to the right
by 3n dB; then, the improvement with increased P is less dramatic
in this region of high signal-to-noise ratio.

Fig. 1 — Error-probability bounds for widely-orthogonal signaling, P = 2.


4.3 Computing Spectral Bounds 


It has been observed that bounds on the error probability for the
case of widely orthogonal signals can be obtained from bounds on the
spectrum of $B(m)\sigma$, m = 1, 2. We now give several easily computed
formulas for these bounds.

Recall that B(m) is defined to be $\int dt\, b^*(t, m)\, b(t, m)$, a matrix of
scalar products or a Gram matrix. In general, this is uninformative,
since a matrix is a Gram matrix if and only if the matrix is positive
semidefinite. However, we will shortly use the fact that in our case
the diagonal entries of B(m) are unity because of the normalization.
Next, note that $B(m)\sigma$ is similar to $\sigma^{1/2}B(m)\sigma^{1/2}$, a hermitian matrix
which has real roots (since $\sigma$ is a real diagonal matrix with positive
entries, the matrices $\sigma^{1/2}$ and $\sigma^{-1/2}$ exist; then $\sigma^{1/2}[B(m)\sigma]\sigma^{-1/2} = \sigma^{1/2}B(m)\sigma^{1/2}$).
When B(m) is diagonal or nearly so, the roots of $B(m)\sigma$ should be
close to the entries of $\sigma$; this is justified by the following theorem.
The characteristic roots of any matrix A lie in the closed region of
the z-plane consisting of all the disks $\{z : |z - A_{ii}| \le \sum_{j \ne i} |A_{ij}|,\ i = 1, 2, \ldots, P\}$.
In our case, the region must be on the real line,
and we obtain a set of not necessarily nonoverlapping intervals centered
about $\{\sigma_i\}$, the half-widths being $\{\sum_{j \ne i} |B_{ij}(m)|\, \sigma_j\}$ when we take
$A = B(m)\sigma$. The spectral bounds are then the rightmost right-end point

\[ \max_i \Big[ A_{ii} + \sum_{j \ne i} |A_{ij}| \Big], \]

and the leftmost left-end point

\[ \min_i \Big[ A_{ii} - \sum_{j \ne i} |A_{ij}| \Big] \]

(when it is positive).

A family of spectral bounds is obtainable from this theorem by apply-
ing it to $B(m)\sigma$ and to matrices similar to $B(m)\sigma$, e.g., $\sigma^{1/2}B(m)\sigma^{1/2}$, $\sigma B(m)$,
and more generally $\sigma^{a}B(m)\sigma^{1-a}$, $0 \le a \le 1$. Thus, we have the family
of upper spectral bounds

\[ \max_i \Big\{ \sigma_i + \sum_{j \ne i} \sigma_i^{a}\, |B_{ij}(m)|\, \sigma_j^{1-a} \Big\}, \qquad 0 \le a \le 1. \tag{6} \]

Fig. 2 — Error-probability bounds for widely-orthogonal signaling, P = 4.

Fig. 3 — Error-probability bounds for widely-orthogonal signaling, P = 8.


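The question arises: which is the smallest upper bound? It is not true
in general that a bound is attained for the value of i that maximizes
$\sigma_i$, but suppose this is the case when a = 0. That is to say, suppose
$\sigma_i = \max_k \sigma_k$ and that

\[ \sigma_i + \sum_{j \ne i} |B_{ij}(m)|\, \sigma_j = \max_k \Big\{ \sigma_k + \sum_{j \ne k} |B_{kj}(m)|\, \sigma_j \Big\}. \]

Then it follows that this is the smallest bound in the family, for $\sigma_j/\sigma_i \le 1$
implies that

\[ \sum_{j \ne i} |B_{ij}(m)| \left( \frac{\sigma_j}{\sigma_i} \right) \le \sum_{j \ne i} |B_{ij}(m)| \left( \frac{\sigma_j}{\sigma_i} \right)^{1-a}, \]

and hence

\[ \sigma_i \Big[ 1 + \sum_{j \ne i} |B_{ij}(m)| \Big( \frac{\sigma_j}{\sigma_i} \Big) \Big]
   \le \sigma_i \Big[ 1 + \sum_{j \ne i} |B_{ij}(m)| \Big( \frac{\sigma_j}{\sigma_i} \Big)^{1-a} \Big]
   \le \max_k\, \sigma_k \Big[ 1 + \sum_{j \ne k} |B_{kj}(m)| \Big( \frac{\sigma_j}{\sigma_k} \Big)^{1-a} \Big]. \]

Similarly, we have the family of lower spectral bounds

\[ \min_i \Big\{ \sigma_i - \sum_{j \ne i} \sigma_i^{a}\, |B_{ij}(m)|\, \sigma_j^{1-a} \Big\}, \qquad 0 \le a \le 1. \tag{7} \]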


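The largest lower bound is obtained when a = 1 provided that $\sigma_i =
\min_k \sigma_k$ and that

\[ \sigma_i \Big[ 1 - \sum_{j \ne i} |B_{ij}(m)| \Big] = \min_k \Big\{ \sigma_k \Big[ 1 - \sum_{j \ne k} |B_{kj}(m)| \Big] \Big\}. \]

To see this, observe that $\sigma_j/\sigma_i \ge 1$ implies

\[ \sum_{j \ne i} |B_{ij}(m)| \left( \frac{\sigma_j}{\sigma_i} \right)^{1-a} \ge \sum_{j \ne i} |B_{ij}(m)|, \]

and hence

\[ \sigma_i \Big[ 1 - \sum_{j \ne i} |B_{ij}(m)| \Big]
   \ge \sigma_i \Big[ 1 - \sum_{j \ne i} |B_{ij}(m)| \Big( \frac{\sigma_j}{\sigma_i} \Big)^{1-a} \Big]
   \ge \min_k\, \sigma_k \Big[ 1 - \sum_{j \ne k} |B_{kj}(m)| \Big( \frac{\sigma_j}{\sigma_k} \Big)^{1-a} \Big]. \]

It should be noted that less sharp bounds are easily obtained. For
example, the matrix $\sigma B(m)$ yields the upper bound

\[ \max_i \Big\{ \sigma_i \Big[ 1 + \sum_{j \ne i} |B_{ij}(m)| \Big] \Big\} \le \max_i \sigma_i \cdot \max_i \Big[ 1 + \sum_{j \ne i} |B_{ij}(m)| \Big], \]

and the right-hand side is easily computed. The corresponding lower
bound is

\[ \min_i \Big\{ \sigma_i \Big[ 1 - \sum_{j \ne i} |B_{ij}(m)| \Big] \Big\} \ge \min_i \sigma_i \cdot \Big[ 1 - \max_i \sum_{j \ne i} |B_{ij}(m)| \Big]. \]

These less sharp bounds are easier to compute than those obtained in
a similar fashion from $B(m)\sigma$ or from $\sigma^{a}B(m)\sigma^{1-a}$.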
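
The family of bounds lends itself to a very short computation. The following sketch is an illustration only, with the hypothetical name `spectral_bounds`: it assumes B is given as a list of rows with unit diagonal and that the path gains are positive, and it applies the disk theorem to $\sigma^{a}B\sigma^{1-a}$ for a chosen exponent a.

```python
def spectral_bounds(B, sigma, a):
    """Return (lower, upper) disk-theorem bounds on the roots of B*sigma,
    obtained from the similar matrix sigma^a * B * sigma^(1-a), 0 <= a <= 1."""
    P = len(sigma)
    lowers, uppers = [], []
    for i in range(P):
        # half-width of the i-th interval, centered at sigma[i] since B[i][i] = 1
        radius = sum(sigma[i] ** a * abs(B[i][j]) * sigma[j] ** (1.0 - a)
                     for j in range(P) if j != i)
        lowers.append(sigma[i] - radius)
        uppers.append(sigma[i] + radius)
    return min(lowers), max(uppers)

# Two-path example: the roots of B*sigma are 0.36 and 0.64.
B = [[1.0, 0.2], [0.2, 1.0]]
sigma = [0.6, 0.4]
for a in (0.0, 0.5, 1.0):
    print(a, spectral_bounds(B, sigma, a))
```

In this example the stronger path attains the a = 0 upper bound and the weaker path attains the a = 1 lower bound, so a = 0 gives the smallest upper bound (0.68) and a = 1 the largest lower bound (0.32), as the argument above indicates for that situation.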

Also, it should be noted that sharper bounds can be obtained by
employing a sharper theorem of matrix theory: The characteristic
roots of any matrix A lie in the closed region of the z-plane consisting
of all the ovals $|z - A_{ii}|\,|z - A_{jj}| \le \big( \sum_{k \ne i} |A_{ik}| \big) \big( \sum_{k \ne j} |A_{jk}| \big)$, $i \ne j$.
We do not pursue these bounds, but note that simple formulas are
obtained only when all paths have equal strength, $\sigma_i = 1/P$.

It is now clear that when B(m) is essentially diagonal, with
$\sum_{j \ne i} |B_{ij}(m)| \ll 1$ for all i, the path gains $\sigma_i$ are good nominal values
for the characteristic roots of $B(m)\sigma$. If, moreover, these path gains
are equal, or approximately equal, then the upper and lower spectral
bounds are close to one another. When this narrow-spectrum condition
prevails, the positive and negative portions of the spectrum of L(m)Q
are also narrow, and the bounds on error probability are sharp.


V. OTHER SPECIAL CASES 


5.1 Well-resolved Paths and the Theory of Diversity 


We consider the case in which the signals are resolvable, B(1) =
B(2) = I, i.e., the paths are well separated in time and frequency
so that any time-Doppler shifted version of a signal is orthogonal to
any other version of itself. Moreover, we also assume that $B(1, 2) =
\int dt\, b^*(t, 1)\, b(t, 2)$ becomes a diagonal matrix, $B(1, 2) = \rho I$ where
$\rho = \int dt\, x_1^*(t)\, x_2(t)$, i.e., the paths are sufficiently separated so that
any version of one signal is orthogonal to all but the same-path version
of the other signal.

It is then easily seen that the covariance matrix is comprised of
diagonal submatrices. For m = 1,

\[ L^{11}(1) = \sigma + (N_0/2E_1) I, \qquad L^{12}(1) = \rho\, [\sigma + (N_0/2E_1) I], \]
\[ L^{21}(1) = \rho^* [\sigma + (N_0/2E_1) I], \qquad L^{22}(1) = |\rho|^2 \sigma + (N_0/2E_1) I. \]

For m = 2, assuming $E_2 > 0$,

\[ L^{11}(2) = (E_2/E_1)\, [\, |\rho|^2 \sigma + (N_0/2E_2) I \,], \]
\[ L^{12}(2) = \rho\, (E_2/E_1)\, [\sigma + (N_0/2E_2) I], \]
\[ L^{21}(2) = \rho^* (E_2/E_1)\, [\sigma + (N_0/2E_2) I], \]
\[ L^{22}(2) = (E_2/E_1)\, [\sigma + (N_0/2E_2) I]. \]

Moreover, the matrix Q is diagonal, being related to

\[ (2E_m/N_0) H(m) = (2E_m/N_0) [(2E_m/N_0) I + \sigma^{-1}]^{-1} = \sigma [\sigma + (N_0/2E_m) I]^{-1}. \]

It then follows that L(m)Q is comprised of diagonal submatrices.
To find the spectrum, the order of the determinantal equation can be
reduced from 2P to P. Then the argument of the determinant is quad-
ratic in $\lambda$. For the case $E_1 = E_2$, a method of Turin [(22)-(23) in Ref. 5]
can be used relating the $\lambda_k$ to the eigenvalues (elements) of $\sigma$.

The above example brings the present analysis in contact with the
theory of diversity combining, see e.g., Ref. 3, Sec. 7.4. Turin, for
example, considered the case in which separate waveforms are available
and the fading is nonindependent in general. In our analysis, only one
waveform is in general available. But in the case of well-separated paths,
we may assume P separate signal waveforms have been observed.
However, these separate waveforms must fade independently in keeping
with our general discrete-path model, and the on-diagonal component
matrices of L(m), viz, $L^{11}(m) = (N_0/2E_1)\langle Z_1 Z_1^{\dagger} \rangle$ and $L^{22}(m) =
(N_0/2E_1)\langle Z_2 Z_2^{\dagger} \rangle$, are themselves diagonal matrices. It is still entirely
possible that $L^{12}(m)$, the off-diagonal component matrix of L(m), is
not a diagonal matrix; e.g., when $x_2(t)$ is a delayed version of $x_1(t)$,
then time overlap may preclude B(1, 2) being diagonal even though
B(1) and B(2) are diagonal. But when we assume that B(1, 2) is also
diagonal, then we obtain the form for L(m) exhibited above. It can be
observed that this is precisely the result Turin obtained for the case
of optimum diversity combining, where his not necessarily diagonal A
becomes our diagonal $\sigma$. When B(1, 2) is not diagonal, then our
results do not specialize to the form given by Turin, a reflection of
the fact that the multipath channel is not in general fully equivalent
to a diversity channel.


5.2 On-Off Keying 
Another example is the case of on-off keying in which $E_2 = 0$. The
test statistic Z'QZ becomes

\[ [(N_0/2E_1)^{1/2} Z(1)]^{\dagger}\, (2E_1/N_0) H(1)\, [(N_0/2E_1)^{1/2} Z(1)], \quad \text{since } Q^{22} = 0. \]

Thus, the distribution is determined by the spectrum of the matrix
$L^{11}(m)Q^{11}$, where

\[ L^{11}(m) = \delta_{m1} B(1, m)\, \sigma\, B(m, 1) + (N_0/2E_1) B(1), \]
\[ Q^{11} = B^{-1}(1) \left\{ I + (N_0/2E_1) [B(1)\sigma]^{-1} \right\}^{-1}. \]

Observe that we no longer have the difference of positive-definite
forms, the test statistic now being a positive random variable. The
threshold $(N_0/2E_1)\theta$ is

\[ (N_0/2E_1) \log \det \left[ \frac{2E_1}{N_0} B(1)\sigma + I \right], \]

which is positive since the eigenvalues of $(2E_1/N_0)B(1)\sigma + I$ are
greater than unity.
Assuming that the spectrum of $L^{11}(m)(2E_1/N_0)H(1)$ lies in the
interval $(\underline{\mu}, \bar{\mu})$, where $\underline{\mu}$ and $\bar{\mu}$ are functions of m, the bounds on the dis-
tribution function are

\[ G\{(\bar{\mu})^{-1}(N_0/2E_1)\theta;\, P,\, 0\} \le F_m[(N_0/2E_1)\theta] \le G\{(\underline{\mu})^{-1}(N_0/2E_1)\theta;\, P,\, 0\}. \]

Recall that G(x; P, 0) is related to the incomplete gamma function,

\[ G(x; P, 0) = I(x, P - 1). \]

The spectral bounds must exhibit two forms of $(2E_1/N_0)$-dependence.
When m = 1, $L^{11}(1)Q^{11} = B(1)\sigma$, and bounds on $B(1)\sigma$ become $\underline{\mu}$ and $\bar{\mu}$.
When m = 2, $L^{11}(2)Q^{11} = (N_0/2E_1)\{I + (N_0/2E_1)[B(1)\sigma]^{-1}\}^{-1}$, so that

\[ \bar{\mu} = (N_0/2E_1)[1 + (N_0/2E_1)(\bar{\omega})^{-1}]^{-1}, \]
\[ \underline{\mu} = (N_0/2E_1)[1 + (N_0/2E_1)(\underline{\omega})^{-1}]^{-1}, \]

where the spectrum of $B(1)\sigma$ is confined to $(\underline{\omega}, \bar{\omega})$.
Collecting our results, when m = 1,

\[ F_1[(N_0/2E)\theta] \le I\{(\underline{\omega})^{-1}(N_0/2E) P \log[(2E/N_0)\bar{\omega} + 1];\, P - 1\}, \]
\[ F_1[(N_0/2E)\theta] \ge I\{(\bar{\omega})^{-1}(N_0/2E) P \log[(2E/N_0)\underline{\omega} + 1];\, P - 1\}. \]

Similarly, when m = 2,

\[ F_2[(N_0/2E)\theta] \le I\{[1 + (N_0/2E)(\underline{\omega})^{-1}] P \log[(2E/N_0)\bar{\omega} + 1];\, P - 1\}, \]
\[ F_2[(N_0/2E)\theta] \ge I\{[1 + (N_0/2E)(\bar{\omega})^{-1}] P \log[(2E/N_0)\underline{\omega} + 1];\, P - 1\}. \]


These results permit the computation of error-probability-bound curves 
that would be universal in the same sense as the curves for widely- 
orthogonal signaling, i.e., the curves would apply to any element of 
the set of channels for which the spectral bounds are met. 


VI. CHERNOFF BOUNDS 


6.1 General Case 


Up to this point, consideration of spectral bounds has led to error-
probability bounds which are sharp when the spectrum comprises
narrow positive and negative portions. These bounds are easy to employ 
when B(1, 2) = 0 and B(1), B(2) are nearly diagonal matrices. But 
in more general cases, the estimation of spectral bounds may be difficult 
and bounds may be poor approximations of eigenvalues. We turn to 
another technique of bounding error probability which does not ex- 
plicitly require spectral bounds. 

Consider the error probability when hypothesis m = 2 is true,
$P_e(2) = \Pr\{Z'QZ > (N_0/2E)\theta \mid 2\}$. Recall that the unit step function
U(x) is unity for x > 0, zero for x < 0, and one-half for x = 0. Then

\[ P_e(2) = \Pr\{U[Z'QZ - (N_0/2E)\theta] = 1 \mid 2\} = \mathcal{E}_2\{U[Z'QZ - (N_0/2E)\theta]\}, \]

where $\mathcal{E}_2$ denotes expectation under hypothesis m = 2. But since
$U(x) \le \exp(\mu x)$ for any $\mu > 0$, we have

\[ P_e(2) \le \mathcal{E}_2\{\exp \mu_2 [Z'QZ - (N_0/2E)\theta]\}. \]

This average can readily be computed, since Z'QZ has the same dis-
tribution as $\sum_k \lambda_k(2)\, |z_k|^2$, where $\{\lambda_k(2)\}$ is the spectrum of L(2)Q.
Since $\mathcal{E} z_k = 0$, $\mathcal{E} z_j z_k^* = \delta_{jk}$, and $\mathcal{E} z_j z_k = 0$, the Gaussian variables
$\{\mathrm{Re}\, z_k\}$, $\{\mathrm{Im}\, z_k\}$ are independent with zero mean and variance equal
to 1/2. Thus, $P_e(2)$ is bounded from above by

\[ \exp[-\mu_2 (N_0/2E)\theta] \left\{ \prod_{k=1}^{2P} \mathcal{E} \exp(\mu_2 \lambda_k(2)\, |\mathrm{Re}\, z_k|^2) \right\}^{2}, \]

where the outer square appears because the product involving $\{\mathrm{Im}\, z_k\}$
has been suppressed. But a standard calculation shows

\[ \mathcal{E} \exp(\mu_2 \lambda_k(2)\, |\mathrm{Re}\, z_k|^2) = [1 - \mu_2 \lambda_k(2)]^{-1/2}, \quad \text{when } \mu_2 \lambda_k(2) < 1, \]

and our bound is

\[ \exp[-\mu_2 (N_0/2E)\theta] \prod_{k=1}^{2P} [1 - \mu_2 \lambda_k(2)]^{-1}. \]

Thus,

\[ P_e(2) \le \exp[-\mu_2 (N_0/2E)\theta]\, \det{}^{-1}[I - \mu_2 L(2)Q], \tag{8} \]

which holds for all $\mu_2$ such that $0 < \mu_2 < [\max_k \lambda_k(2)]^{-1}$.

The above procedure is adopted from the technique due to Chernoff
(see Ref. 3, Sec. 2.5 and 7.4). Here, we do not have identically
distributed variables; indeed, half are positive and half are negative
random variables.

To find the best value of $\mu_2$, we write the bound as

\[ \exp\left\{ -\mu_2 \frac{N_0}{2E} \theta - \ln \prod_{k=1}^{2P} [1 - \mu_2 \lambda_k(2)] \right\} \]

and differentiate the argument of the exponential. A necessary con-
dition for an extremum is that the derivative be zero, and this yields

\[ (N_0/2E)\theta = \sum_{k=1}^{2P} \frac{\lambda_k(2)}{1 - \mu_2 \lambda_k(2)}
   = \sum_{k=1}^{2P} \frac{1}{\lambda_k^{-1}(2) - \mu_2}
   = \mathrm{tr}\, \{ [(L(2)Q)^{-1} - \mu_2 I]^{-1} \}. \]

If the value of $\mu_2$ that satisfies this equation lies within the allow-
able interval $[0, (\max_k \lambda_k(2))^{-1}]$, then this value of $\mu_2$ minimizes the
upper bound. A minimum occurs because the second derivative of the
argument of the exponential is positive, being

\[ \sum_{k=1}^{2P} \left[ \frac{\lambda_k(2)}{1 - \mu_2 \lambda_k(2)} \right]^{2}. \]

In a similar fashion, the error probability for m = 1 can be over-
bounded.

\[ P_e(1) = \Pr\{-Z'QZ > -(N_0/2E)\theta \mid 1\} \le \mathcal{E}_1\{\exp \mu_1 [-Z'QZ + (N_0/2E)\theta]\}, \]
\[ P_e(1) \le \exp[\mu_1 (N_0/2E)\theta]\, \det{}^{-1}[I + \mu_1 L(1)Q]. \]

The best value of $\mu_1$ satisfies

\[ (N_0/2E)\theta = \mathrm{tr}\, \{ L(1)Q\, [I + \mu_1 L(1)Q]^{-1} \}, \]

provided this value lies in the allowable interval $[0, (\max_k (-\lambda_k(1)))^{-1}]$.
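
The optimization over the Chernoff parameter is a one-dimensional root search, which the sketch below illustrates for the m = 2 form of the bound. It is a hedged example rather than the authors' procedure: it assumes the eigenvalues of L(2)Q are already known (at least one of them positive), writes the threshold $(N_0/2E)\theta$ simply as `threshold`, and uses bisection on the stationarity condition; the function names are hypothetical.

```python
import math

def chernoff_bound(eigs, threshold, mu):
    """exp(-mu*threshold) / prod_k (1 - mu*lambda_k), valid for 0 <= mu < 1/max(eigs)."""
    log_det = sum(math.log(1.0 - mu * lam) for lam in eigs)
    return math.exp(-mu * threshold - log_det)

def best_mu(eigs, threshold, iters=100):
    """Solve threshold = sum_k lambda_k / (1 - mu*lambda_k) by bisection.

    The left side minus the right side is the derivative of the exponent,
    which increases with mu, so the root (if any) is unique."""
    def slope(mu):
        return sum(lam / (1.0 - mu * lam) for lam in eigs) - threshold
    lo, hi = 0.0, 1.0 / max(eigs)
    if slope(lo) >= 0.0:
        return 0.0          # exponent already increasing at mu = 0: bound is trivial
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example with one positive and one negative eigenvalue (P = 1).
eigs, threshold = [2.0, -1.0], 3.0
mu_star = best_mu(eigs, threshold)
print(mu_star, chernoff_bound(eigs, threshold, mu_star))
```

For this two-eigenvalue example the exact tail probability is $(2/3)e^{-3/2}$, so the optimized bound can be compared against it directly; the bound is looser, as expected of an exponential overbound.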




6.2 Widely-orthogonal Signals 

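Consider the case in which the signals are widely orthogonal,
B(1, 2) = 0, but have equal energy, $E_1 = E_2 = E$, and are equilikely,
$\alpha_1 = \alpha_2 = \tfrac{1}{2}$. The overbound on $P_e(1)$ is obtained from the spectrum
of L(1)Q, which comprises the spectrum of $L^{11}(1)Q^{11}$ together with
the spectrum of $L^{22}(1)Q^{22}$. Thus,

\[ P_e(1) \le \exp[\mu_1 (N_0/2E)\theta]\, \det{}^{-1}[I + \mu_1 L^{11}(1)Q^{11}]\, \det{}^{-1}[I + \mu_1 L^{22}(1)Q^{22}]. \]

But the matrices used here were related in Paragraph 4.1 to $B(1)\sigma$ and
$B(2)\sigma$, and our bound becomes

\[ \exp[\mu_1 (N_0/2E)\theta]\, \det{}^{-1}[I + \mu_1 B(1)\sigma]
   \cdot \det{}^{-1}\{ I - \mu_1 (N_0/2E) [I + (N_0/2E)(B(2)\sigma)^{-1}]^{-1} \}. \]

After some manipulation, this bound becomes

\[ \frac{ (N_0/2E)^{P} \exp[\mu_1 (N_0/2E)\theta]\, \det[B(2)\sigma + (N_0/2E) I] }
        { \det[\mu_1 (N_0/2E) B(1)\sigma + (N_0/2E) I]\, \det[(1 - \mu_1 N_0/2E) B(2)\sigma + (N_0/2E) I] }, \]

where

\[ \exp[\mu_1 (N_0/2E)\theta] = \left\{ \frac{\det[B(1)\sigma + (N_0/2E) I]}{\det[B(2)\sigma + (N_0/2E) I]} \right\}^{\mu_1 (N_0/2E)}. \]

The maximum allowable value of $\mu_1$ is determined by the largest
eigenvalue of $L^{22}(1)Q^{22}$, which in turn is determined by the largest
eigenvalue of $B(2)\sigma$:

\[ 0 < \mu_1 < \frac{2E}{N_0} + \max_k{}^{-1} (\delta_k), \]

where $\{\delta_k\}$ is the spectrum of $B(2)\sigma$.
The best value of $\mu_1$ is found from the relation

\[ (N_0/2E)\theta = \mathrm{tr}\, \{ L^{11}(1)Q^{11} [I + \mu_1 L^{11}(1)Q^{11}]^{-1} \}
                  + \mathrm{tr}\, \{ L^{22}(1)Q^{22} [I + \mu_1 L^{22}(1)Q^{22}]^{-1} \}, \]

where we again have exploited the decomposition of the spectrum of
L(1)Q. After some manipulation, we find

\[ (N_0/2E)\theta = \mathrm{tr}\, \{ B(1)\sigma [I + \mu_1 B(1)\sigma]^{-1} \}
                  - \mathrm{tr}\, \left\{ B(2)\sigma \left[ I + \left( \frac{2E}{N_0} - \mu_1 \right) B(2)\sigma \right]^{-1} \right\}. \]

An approximate solution can be obtained for the case of high signal-
to-noise ratio. Let $\mu_1 = \tilde{\mu}_1 (2E/N_0)$; the relation becomes

\[ (N_0/2E)\theta = \sum_k \frac{\omega_k}{1 + (2E/N_0)\tilde{\mu}_1 \omega_k} - \sum_k \frac{\delta_k}{1 + (2E/N_0)(1 - \tilde{\mu}_1)\delta_k}. \]

Suppose $\tilde{\mu}_1 (2E/N_0)\omega_k \gg 1$ and $(1 - \tilde{\mu}_1)(2E/N_0)\delta_k \gg 1$. Then the
right side becomes approximately

\[ \frac{P}{(2E/N_0)\tilde{\mu}_1} - \frac{P}{(2E/N_0)(1 - \tilde{\mu}_1)}. \]

Equating this to $(N_0/2E)\theta$ and solving the resulting quadratic for the
root applicable for the case $\theta = 0$ yields

\[ \tilde{\mu}_1 = \left[ \frac{1}{2} + \frac{P}{\theta} \right] - \left[ \frac{1}{4} + \left( \frac{P}{\theta} \right)^{2} \right]^{1/2}. \]

When $\theta/2P$ is small, this value of $\tilde{\mu}_1$ is approximately

\[ \frac{1}{2} \left[ 1 - \frac{\theta}{4P} \right], \]

and the corresponding value of $\mu_1$ is $(E/N_0)[1 - (\theta/4P)]$, which is ap-
proximately at the midpoint of the allowable interval.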
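In a similar fashion, the overbound on $P_e(2)$ is

\[ \exp[-\mu_2 (N_0/2E)\theta]\, \det{}^{-1}[I - \mu_2 L^{11}(2)Q^{11}]\, \det{}^{-1}[I - \mu_2 L^{22}(2)Q^{22}], \]

which becomes

\[ \exp[-\mu_2 (N_0/2E)\theta]\, \det{}^{-1}\{ I - \mu_2 (N_0/2E) [I + (N_0/2E)(B(1)\sigma)^{-1}]^{-1} \}\, \det{}^{-1}[I + \mu_2 B(2)\sigma], \]

or

\[ \frac{ (N_0/2E)^{P} \exp[-\mu_2 (N_0/2E)\theta]\, \det[B(1)\sigma + (N_0/2E) I] }
        { \det[(1 - \mu_2 N_0/2E) B(1)\sigma + (N_0/2E) I]\, \det[\mu_2 (N_0/2E) B(2)\sigma + (N_0/2E) I] }, \]

where

\[ \exp[-\mu_2 (N_0/2E)\theta] = \left\{ \frac{\det[B(2)\sigma + (N_0/2E) I]}{\det[B(1)\sigma + (N_0/2E) I]} \right\}^{\mu_2 (N_0/2E)}. \]

The maximum allowable value of $\mu_2$ is determined by the largest eigen-
value of $L^{11}(2)Q^{11}$, in turn determined by the largest eigenvalue of
$B(1)\sigma$:

\[ 0 < \mu_2 < \frac{2E}{N_0} + \max_k{}^{-1} (\omega_k), \]

where $\{\omega_k\}$ is the spectrum of $B(1)\sigma$. The best value of $\mu_2$ satisfies

\[ (N_0/2E)\theta = \mathrm{tr}\, \{ L^{11}(2)Q^{11} [I - \mu_2 L^{11}(2)Q^{11}]^{-1} \}
                  + \mathrm{tr}\, \{ L^{22}(2)Q^{22} [I - \mu_2 L^{22}(2)Q^{22}]^{-1} \}, \]

\[ (N_0/2E)\theta = \mathrm{tr}\, \left\{ B(1)\sigma \left[ \left( \frac{2E}{N_0} - \mu_2 \right) B(1)\sigma + I \right]^{-1} \right\}
                  - \mathrm{tr}\, \{ B(2)\sigma [I + \mu_2 B(2)\sigma]^{-1} \}. \]

Let $\mu_2 = \tilde{\mu}_2 (2E/N_0)$ and suppose $\tilde{\mu}_2 (2E/N_0)\delta_k \gg 1$, $(1 - \tilde{\mu}_2)(2E/N_0)\omega_k \gg 1$.
Then the right side becomes

\[ \frac{P}{(2E/N_0)(1 - \tilde{\mu}_2)} - \frac{P}{(2E/N_0)\tilde{\mu}_2}, \]

and the approximation of the best value of $\tilde{\mu}_2$ is

\[ \tilde{\mu}_2 = \left[ \frac{1}{2} - \frac{P}{\theta} \right] + \left[ \frac{1}{4} + \left( \frac{P}{\theta} \right)^{2} \right]^{1/2}, \]

which is approximately $\tfrac{1}{2}[1 + (\theta/4P)]$ when $(\theta/2P) \ll 1$.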

The foregoing results can be specialized to the case in which the
paths are resolvable, B(1) = B(2) = I. Then $\theta = 0$, and it is easily
seen that the best value of $\tilde{\mu}_m$ is 1/2. Both overbounds become

\[ \left( \frac{E}{2N_0} \right)^{-P} \frac{\det[\sigma + (N_0/2E) I]}{\det{}^{2}[\sigma + (N_0/E) I]}, \]

and this agrees with equation 7.134 in Ref. 3.

It should be noted that $\tilde{\mu}_m = \tfrac{1}{2}$ is always an allowed value of $\tilde{\mu}_m$.
For the case of resolvable paths, it is the best value, and whenever
$\theta/P \ll 1$ and $2E/N_0$ is sufficiently large, it is close to the best value.
Using $\tilde{\mu}_m = \tfrac{1}{2}$, we can obtain an overbound for both error probabilities,
i.e., for $P_e(m)$, m = 1, 2. This overbound is

\[ \left( \frac{E}{2N_0} \right)^{-P} \exp(\tfrac{1}{2} |\theta|)\,
   \frac{\det[B(3 - m)\sigma + (N_0/2E) I]}{\det[B(1)\sigma + (N_0/E) I]\, \det[B(2)\sigma + (N_0/E) I]}. \tag{9a} \]

The factor $\exp(\tfrac{1}{2} |\theta|)$ can also be written in terms of determinants.
When $\det[B(1)\sigma + (N_0/2E) I]$ is larger than $\det[B(2)\sigma + (N_0/2E) I]$,
we have

\[ \exp(\tfrac{1}{2} |\theta|) = \left\{ \frac{\det[B(1)\sigma + (N_0/2E) I]}{\det[B(2)\sigma + (N_0/2E) I]} \right\}^{1/2}, \tag{9b} \]

and when the reverse inequality holds, $\exp(\tfrac{1}{2} |\theta|)$ is the reciprocal
of the above.
For the case in which the spectrum of $B(m)\sigma$ lies in the interval
$((1 - \beta)/P,\ (1 + \beta)/P)$ where $\beta < 1$, the overbound can be further over-
bounded. The factor involving determinants is less than

\[ \left\{ \frac{4 (N_0 P/2E) [1 + \beta + (N_0 P/2E)]}{[1 - \beta + 2(N_0 P/2E)]^{2}} \right\}^{P}, \]

and $|\theta|/2$ is less than

\[ \frac{P}{2} \log \frac{1 + \beta + (N_0 P/2E)}{1 - \beta + (N_0 P/2E)}. \]

It follows that the Chernoff bound is less than

\[ \left\{ \frac{4 (N_0 P/2E) [1 + \beta + (N_0 P/2E)]^{3/2}}{[1 - \beta + (N_0 P/2E)]^{1/2}\, [1 - \beta + (N_0 P/E)]^{2}} \right\}^{P}. \tag{10} \]
Numerical values of this bound are given in Fig. 4, and it has the
same general character as the spectral-related bounds. Rather than
sharpness given a nominal value of error probability $P_e$, we consider
the sensitivity measured by the change in $2E/N_0P$ (in dB) vs $\beta$;
for $P_e$ = 10-* and P = 4, the sensitivity is 2 dB for $\beta$ = 0.1. The
sensitivity does not markedly increase with an increase in P, in agree-
ment with the behavior of the sharpness of the previous bounds.
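
A closed-form overbound of this kind is straightforward to tabulate. The sketch below is an illustration under stated assumptions, not the computation behind Fig. 4: it takes the overbounded Chernoff bound to be the P-th power of $4c\,(1 + \beta + c)^{3/2} / [(1 - \beta + c)^{1/2} (1 - \beta + 2c)^{2}]$ with $c = N_0P/2E$, which is the form obtained by combining the determinant and threshold bounds for a spectrum confined to $((1-\beta)/P, (1+\beta)/P)$. The function name and the argument `snr_per_path` = $2E/N_0P$ are hypothetical.

```python
def overbounded_chernoff(snr_per_path, P, beta):
    """Overbounded Chernoff bound on the error probability, 0 <= beta < 1."""
    c = 1.0 / snr_per_path                      # N0*P/2E
    per_path = (4.0 * c * (1.0 + beta + c) ** 1.5
                / ((1.0 - beta + c) ** 0.5 * (1.0 - beta + 2.0 * c) ** 2))
    return per_path ** P

# Tabulate the bound for P = 4 at a few per-path signal-to-noise ratios.
for snr_db in (10, 15, 20):
    snr = 10.0 ** (snr_db / 10.0)
    print(snr_db, overbounded_chernoff(snr, 4, 0.0), overbounded_chernoff(snr, 4, 0.1))
```

With $\beta = 0$ and P = 1 the expression reduces to $4p(1 - p)$ with $p = 1/(2 + 2E/N_0)$, the familiar Chernoff form for a single resolvable path, which serves as a consistency check.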
Comparison of the Chernoff bound with the previous bounds is
conveniently done for the case $\beta = 0$ (cf. Sec. 7.4 of Ref. 3). The
Chernoff bound does not specify a signal-to-noise ratio (required to
achieve a nominal $P_e$) excessively greater than the previous value;
for P = 4, less than 2.2 dB difference is observed. This excess does
decrease with increasing P. Moreover, it is entirely conceivable that in
a broad-spectrum case with a large number of paths, an exact value of
the Chernoff bound would be better than the spectral-bound result.
Of course, our inexact (overbounded) Chernoff bound is poor in the
broad-spectrum case, but a Chernoff bound using the proper values
of the determinants should be good for two reasons. (1) Such a bound
reflects the precise values of the eigenvalues of the matrix L(m)Q.
(2) When P is large, the probability density function is bell shaped
with the probability "mass" being concentrated near the mean and
most of the tail mass being at the leading portion of the tail; then
the tail mass can be weighted by the exponential function with little
error. On the other hand, the spectral-bound approach suffers in the
broad-band case since the spectral bounds are not meaningful approxi-
mations of all the eigenvalues.

Fig. 4 — Overbounded Chernoff bounds for widely-orthogonal signaling.


VII. DISCUSSION 


Having observed that exact computation of error probability is
cumbersome and depends upon an often inordinately large number of
parameters, we considered error-probability bounds (2) that are uni-
versal in the sense that they apply to any one of a set of channels
satisfying spectral bounds (1). Our bounds employ (3), the distribu-
tion function of the difference of chi-square variables. For the special
case of widely orthogonal signals, we obtained bounds employing
parameters (5) in terms of the spectral width $\beta$, see (4), of the
matrices $B(m)\sigma$. Plots of these bounds showed that sharpness meas-
ured in dB change of $2E/N_0P$ with respect to $\beta$ for a fixed value of
error probability is not sensitive to the value of P. We presented a
technique for obtaining spectral bounds for $B(m)\sigma$ when it is nearly
diagonal, representative results being (6) and (7). This technique can
also be applied to L(m)Q for the more general case in which the
signals are not widely orthogonal.

The case of resolvable signals (B(m) = I) made contact with the
theory of diversity; we found that for the multipath channel to be a
diversity channel, B(1, 2) must also be a diagonal matrix. Of course,
the previous results also were in contact with diversity theory. With
B(1, 2) = 0 (a diagonal matrix) but B(m) not necessarily diagonal,
our results generalize those of diversity theory in the following sense.
The special case $\beta = 0$ corresponds to a diversity channel with equal
link gains, but the general case $\beta \ne 0$ can arise in the nondiversity
situation when the matrix B(m) is not diagonal. (If B(m) were
diagonal, B(m) = I and the diversity case prevails.)

We then turned to the Chernoff bound (8) which does not ex-
plicitly employ spectral bounds. The overbounded form (10) for the
case of widely orthogonal signals was poorer than the previous bound
when $\beta = 0$. Nevertheless, there is promise that in a broad-spectrum
case, the original form (9) would be better than the spectral-related
bounds. A further advantage is that once the determinants are eval-
uated, perhaps on an electronic computer, the error-probability bound
is immediately obtained. In contrast, the spectral-related bounds
require a certain amount of computation involving incomplete gamma
functions even after spectral bounds are obtained.
APPENDIX A 

Here we show that the number of positive eigenvalues of $LQ$ equals the number of negative eigenvalues.* Recall that $L$ is positive definite and that $Q$ can be written in the partitioned form 

$$ Q = \begin{bmatrix} Q' & 0 \\ 0 & Q'' \end{bmatrix}, $$

where $Q'$ and $-Q''$ are positive definite. Clearly, the number of positive eigenvalues of $Q$ equals the number of negative eigenvalues. We can construct a family of positive definite matrices $L_t$, $0 \le t \le 1$, such that $L_0 = I$, $L_1 = L$, and $L_t$ is continuous in $t$. For example, let $L_t = (1 - t)I + tL$; $L_t$ has positive eigenvalues $\{(1 - t) + t\gamma_i\}$, where $\{\gamma_i\}$ are the eigenvalues of $L$. Now the eigenvalues of $L_tQ$ are real, for $L_tQ$ is similar to the Hermitian matrix $L_t^{1/2} Q L_t^{1/2} = L_t^{-1/2}(L_tQ)L_t^{1/2}$, where $L_t^{1/2}$ and $L_t^{-1/2}$ exist since $L_t$ is positive definite. Moreover, the eigenvalues of $L_tQ$ are continuous in $t$, since $L_t$ is continuous in $t$. But $L_tQ$ never has a zero eigenvalue, for $L_t$ is positive definite and $(L_tQ)^{-1} = Q^{-1}L_t^{-1}$ always exists. Since the eigenvalues are real, continuous in $t$, and never zero, it follows that no positive eigenvalue of $L_tQ$ can become negative as $t$ varies on [0, 1], and no negative eigenvalue of $L_tQ$ can become positive. The conclusion is established. 
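The continuity argument can be illustrated numerically. The following Python sketch is not part of the original proof: it takes a hypothetical 2 x 2 positive definite L and a diagonal Q with one positive and one negative entry (both chosen only for illustration), walks along the path L_t = (1 - t)I + tL, and records whether L_tQ keeps one negative and one positive eigenvalue.

```python
import math

def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eig2(m):
    """Eigenvalues (ascending) of a 2x2 matrix whose spectrum is real."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)  # det < 0 here, so the root is real
    return (tr - root) / 2.0, (tr + root) / 2.0

L = [[2.0, 0.6], [0.6, 1.0]]    # hypothetical positive definite L
Q = [[1.5, 0.0], [0.0, -0.7]]   # Q' = 1.5 and -Q'' = 0.7 are positive

def sign_pattern(steps=50):
    """(negative eigenvalue present, positive eigenvalue present) along L_t."""
    pattern = []
    for s in range(steps + 1):
        t = s / steps
        Lt = [[(1.0 - t) * (i == j) + t * L[i][j] for j in range(2)]
              for i in range(2)]
        low, high = eig2(matmul2(Lt, Q))
        pattern.append((low < 0.0, high > 0.0))
    return pattern

pattern = sign_pattern()
```

In this small case the sign pattern stays the same for every t, as the proof requires; larger matrices would need a general eigenvalue routine.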


APPENDIX B 


This appendix presents another derivation of the distribution function of $\sum_{i=1}^{P} |z_i|^2 - a \sum_{i=P+1}^{2P} |z_i|^2$. This derivation makes contact with 


* We are indebted to B. H. Bharucha for the conception of this proof. 
the special functions that have appeared in analyses of diversity channels; also, this derivation appears to admit generalization to the case $x_i = \mathrm{Re}\, z_i$ with $\langle x_i x_j \rangle = \delta_{ij}$. (An odd number of variables in the real case corresponds to half-integer P in the complex case.) 

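The density function of $\sum_{i=1}^{P} |z_i|^2$ is 

$$ f(x) = \begin{cases} \dfrac{x^{P-1} e^{-x}}{(P-1)!} & (x > 0), \\ 0 & (x < 0), \end{cases} $$

and the density function of $-a \sum_{i=P+1}^{2P} |z_i|^2$ is 

$$ g(x) = \begin{cases} 0 & (x > 0), \\ \dfrac{(-x)^{P-1} e^{x/a}}{a^{P} (P-1)!} & (x < 0). \end{cases} $$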

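The density of the sum is the convolution of the densities, 

$$ h(x) = \int_{\max(0,\,x)}^{\infty} dy\, f(y)\, g(x - y), $$

where the first argument of max (· , ·) arises from the truncated form of f and the second argument arises from the truncated form of g. It follows that 

$$ h(x) = \frac{\exp(x/a)}{a^{P} (P-1)!\,(P-1)!} \int_{\max(0,\,x)}^{\infty} dy\, (y - x)^{P-1} y^{P-1} \exp\left[-\left(1 + \frac{1}{a}\right) y\right]. $$

For the case x > 0, the lower limit is x. For the case x < 0, the integral can be cast into the form of the integral for the case x > 0 by a change of variable. The result differs only in the exponential factor, i.e., 

$$ h(x) = \frac{\exp(-x)}{a^{P} (P-1)!\,(P-1)!} \int_{|x|}^{\infty} dy\, (y - |x|)^{P-1} y^{P-1} \exp\left[-\left(1 + \frac{1}{a}\right) y\right], \qquad x < 0. $$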


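The integral can be evaluated with the aid of relation (12) on page 202, Vol. II of Ref. 10, and the common result for the cases x < 0 and x > 0 is 

$$ h(x) = \frac{1}{\sqrt{\pi}\, a^{P} (P-1)!} \left(\frac{a |x|}{1 + a}\right)^{P - 1/2} \exp\left[\left(\frac{1}{a} - 1\right)\frac{x}{2}\right] K_{P-1/2}\left(\frac{1}{2}\left(1 + \frac{1}{a}\right) |x|\right), $$

where $K_{P-1/2}(z)$ is the modified Bessel function of the third kind. 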
The above expression for the density is valid for all P, noninteger as well as integer. But in our application, P is an integer; a relation on page 80 of Ref. 11 yields 

$$ K_{P-1/2}(z) = \sqrt{\frac{\pi}{2z}}\; e^{-z} \sum_{k=0}^{P-1} \frac{(P - 1 + k)!}{k!\,(P - 1 - k)!}\,(2z)^{-k}. $$


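The density is then 

$$ h(x) = \frac{1}{(1 + a)^{P} (P-1)!} \exp\left[\left(\frac{1}{a} - 1\right)\frac{x}{2} - \left(1 + \frac{1}{a}\right)\frac{|x|}{2}\right] \sum_{k=0}^{P-1} \frac{(P - 1 + k)!}{k!\,(P - 1 - k)!} \left(\frac{a}{1 + a}\right)^{k} |x|^{P-1-k}. $$

When x < 0, the exponential becomes exp (x/a), and when x > 0, it becomes exp (-x). 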

Observe that when a = 1, the density is symmetric. When a < 1, the factor exp {[(1 - a)/a] x/2} shifts the mass to the right. When a → 0, it can be shown that h(x) → f(x). 
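A quick numerical sanity check of the density is possible. The Python sketch below assumes the finite-sum form h(x) = exp(.) / [(1 + a)^P (P - 1)!] times the sum over k of (P - 1 + k)! / [k! (P - 1 - k)!] (a/(1 + a))^k |x|^(P-1-k), with exp(-x) for x > 0 and exp(x/a) for x < 0. That form is our reading of the result above; the values P = 3 and a = 0.5 are chosen only for illustration. The check uses the facts that a density integrates to one and that the mean of the weighted sum is P(1 - a).

```python
import math

def h_density(x, P, a):
    """Finite-sum form of the density (assumed reading of the appendix result)."""
    expo = math.exp(-x) if x > 0 else math.exp(x / a)
    total = 0.0
    for k in range(P):
        coeff = math.factorial(P - 1 + k) / (
            math.factorial(k) * math.factorial(P - 1 - k))
        total += coeff * (a / (1.0 + a)) ** k * abs(x) ** (P - 1 - k)
    return expo * total / ((1.0 + a) ** P * math.factorial(P - 1))

def integrate(fn, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    step = (hi - lo) / n
    acc = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, n):
        acc += fn(lo + i * step)
    return acc * step

P, a = 3, 0.5  # illustrative values only
mass = integrate(lambda x: h_density(x, P, a), -40.0, 60.0)
mean = integrate(lambda x: x * h_density(x, P, a), -40.0, 60.0)
```

With a = 1 the same function returns equal values at x and -x, in line with the symmetry noted in the text.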

To obtain the distribution function G(y; P, a), consider first the case y < 0. Since $\int_{-\infty}^{y} dx\, h(x)$ equals $\int_{|y|}^{\infty} dx\, h(-x)$, the following integral arises in each term of the sum, 

$$ \int_{|y|}^{\infty} \frac{dx}{a} \left(\frac{x}{a}\right)^{P-1-k} \exp\left(-\frac{x}{a}\right) = (P - 1 - k)!\,\left[1 - I\left(|y|/a,\; P - 1 - k\right)\right]. $$

The case y > 0 is treated by considering $\int_{-\infty}^{0} dx\, h(x) + \int_{0}^{y} dx\, h(x)$. The integral that arises is just $(P - 1 - k)!\, I(y, P - 1 - k)$. These steps establish our final result, quoted above. 

Our result could also have been obtained from the Fourier transform of the characteristic function $(1 - it)^{-P}(1 + ita)^{-P}$. The Fourier transform of $(\alpha + it)^{-2\mu}(\beta - it)^{-2\nu}$ is given by relation (12) on page 119, Vol. I of Ref. 10 in terms of Whittaker functions that reduce to Bessel functions for the case $\mu = \nu = P/2$ in view of relation (14) on page 265, Ref. 12. The density function can thus be obtained. 
Precoding for Multiple-Speed Data Transmission 


By ROBERT W. CHANG 
(Manuscript received April 24, 1967) 


In certain applications, because of noise, compatibility, or other considerations, 
it is desirable that a data transmission system have the flexibility 
to operate at multiple speeds. In this paper, a precoding scheme for multiple- 
speed digital or analog data transmission is presented. The scheme has a 
flexibility which allows the data rate and overall channel characteristics 
to be changed simultaneously by simply changing the data format and some 
resistive elements. There is no change in the filters, the equalization, the 
transmitter signaling interval, or the receiver sampling time. By using 
partial response channels, a number of commonly used data rates are 
easily obtained, using a physically realizable precoder and correlator. With 
correct timing and the use of orthonormal signals, the signal-to-noise ratio 
is maximized at each data rate for bandlimited white noise under the con- 
straints of fixed line signal power and no intersymbol interference. Timing 
error is considered in a two-speed transmission scheme, and the selection of a 
precoding matrix using eye opening as the criterion is studied. This study 
clearly demonstrates the advantage of changing the overall channel char- 
acteristics when changing the data rate. Eye openings obtained are equal to 
or larger than those of two conventional schemes transmitting at the same 
data rates. 


I. INTRODUCTION 


In conventional pulse amplitude modulation (PAM) data transmis- 
sion systems (digital or analog), the signal at the receiver input 
takes the form 


$$ s(t) = \sum_{k=1}^{N} a_k f(t - kT_0), \tag{1} $$

where $\{a_k\}$ are the information symbols, $T_0$ is the signaling interval, 
and the signals $f(t - kT_0)$, k = 1, ..., N, are time translates of each 




1634 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1967 


other. It is well known? that in order for these systems to meet the 
criterion “Maximize the signal-to-noise ratio in the presence of band- 
limited white noise under the constraints of fixed line signal power and 
no intersymbol interference,” the signals should be designed so that 
the overall channel characteristics are in the Nyquist I class and the 
overall amplitude characteristics are divided equally between the trans- 
mitting and the receiving side. Such a signal design scheme (hereafter 
referred to as Scheme I) is popular and is used even if the system de- 
signer is aware that the channel noise may not be white over the fre- 
quency band of interest. This is because the practical determination 
of the noise statistics and the realization of the corresponding optimum 
filters for a general communication complex are nearly impossible. A 
block diagram of Scheme I is shown in Fig. 1. 

In this paper, a precoding signaling scheme (Scheme II) is presented 
for multiple-speed analog or digital data transmission. Scheme II also 
meets the signal-to-noise ratio criterion above. The very distinctive 
difference between Schemes I and II is that in I the signals $f(t - kT_0)$ 
are time translates, but in II the signals are not necessarily so. This 
property allows the data rate and overall channel characteristics (such 
as represented by the eye opening) of Scheme II to be changed simul- 
taneously without changing the filters, the equalization, the signaling 
interval at the transmitter, or the sampling time at the receiver. 

In Scheme II, a sequence of information symbols is divided into 
blocks and the blocks are transmitted sequentially. For clarity, we 
first consider in Section II the transmission of a single block at a fixed 
data rate and the precoder and the receiver structure. Multiple block 
multispeed transmission and the use of partial response channels are 
considered in Section III. A two-speed transmission scheme, sampling 
time error, and eye patterns are considered in Section IV. 


II. TRANSMISSION OF A SINGLE BLOCK AT A FIXED DATA RATE 


[Fig. 1. Block diagram of Scheme I. Blocks shown: input {a_n}, transmitting filter, transmission medium, noise, receiving filter, decision circuit.] 

[Fig. 2. Block diagram of Scheme II. Blocks shown: input {a_n}, precoder, transmission medium, noise, correlator, decision circuit.] 

A block diagram of Scheme II is shown in Fig. 2. The quantities $H(j\omega)$ and h(t) are, respectively, the transfer function and the impulse response of the transmission medium. We shall consider $H(j\omega)$ to be bandlimited, and 


$$ H(j\omega) \begin{cases} \neq 0, & |\omega| \le 2\pi f_1, \\ = 0, & \text{otherwise}. \end{cases} \tag{2} $$

The time interval 

$$ T = \frac{1}{2 f_1} \ \text{seconds} \tag{3} $$

is the Nyquist interval. 


Consider the transmission of a block of symbols $a_1, \ldots, a_N$. Each symbol can be an m-ary digit ($m \ge 2$) or a real number. The precoder converts $a_1, \ldots, a_N$ into a sequence of numbers $b_1, \ldots, b_N$, and the number $b_k$, k = 1, ..., N, is transmitted at t = kT. This produces a signal at the input to the receiver given by 

$$ s(t) = \sum_{k=1}^{N} b_k h(t - kT). \tag{4} $$

From (2), the impulse responses h(t - kT) are infinitely linearly independent, i.e., 

$$ \sum_{k=1}^{N} b_k h(t - kT) = 0 \ \text{for all } t \;\Longrightarrow\; b_k = 0 \ \text{for all } k, \tag{5} $$

where N can approach infinity. Equation (5) can be proven by noting that the equality 

$$ \sum_{k=1}^{N} b_k h(t - kT) = 0 \quad \text{for all } t $$

and (2) together imply that 

$$ \sum_{k=1}^{N} b_k e^{-j\omega kT} = 0 \quad \text{for } |\omega| \le 2\pi f_1. \tag{6} $$

Equation (6) then implies that $b_k = 0$ for all k. 
As is well known, a bandlimited signal, say g(t), can be represented by its time samples. The vector whose elements are the time samples of g(t) will be referred to as the time sample vector of g(t). For convenience, we shall use time sample vectors in discussing the precoder and receiver structures, and use the signals themselves in analyzing the overall channel characteristics. 

Let $h_k$, an M × 1 vector, be the time sample vector of h(t - kT), where the value of M will be considered later. Then (4) is equivalent to the vector equation 

$$ S = \sum_{k=1}^{N} b_k h_k. \tag{7} $$

The N vectors $h_k$, k = 1, ..., N, are linearly independent since the impulse responses h(t - kT) are. Hence, the N vectors $h_k$, k = 1, ..., N, generate a real Euclidean vector space $\mathcal{E}_N$ of N dimensions. 

If the precoder were not used, we would have $b_k = a_k$ and $S = \sum_{k=1}^{N} a_k h_k$, and the information symbols $a_k$ would be transmitted as coordinates of the basis vectors $h_k$ of $\mathcal{E}_N$. It is well known that the basis can be changed by a linear transformation. A precoder can be used for this purpose so that a suitable set of basis vectors can be chosen for each transmission rate of a multi-speed system based on considerations such as signal-to-noise ratio and the effect of timing error. 


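Define 

$$ A = \begin{bmatrix} a_1 \\ \vdots \\ a_N \end{bmatrix}, \quad B = \begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix}, \quad H = \begin{bmatrix} h_1' \\ \vdots \\ h_N' \end{bmatrix}, \quad V = \begin{bmatrix} V_1' \\ \vdots \\ V_N' \end{bmatrix}, \tag{8} $$

where V represents a set of basis vectors for $\mathcal{E}_N$ and the prime notation represents transpose. Since $h_k$, k = 1, ..., N, generate $\mathcal{E}_N$, V is related to H by 

$$ V = \Lambda H, \tag{9} $$

where $\Lambda = [\lambda_{ij}]$ is an N × N nonsingular matrix. If $a_k$ is transmitted as a coordinate of $V_k$, then 

$$ S = \sum_{k=1}^{N} a_k V_k = V'A = H'\Lambda' A. \tag{10} $$

But, from (7) 

$$ S = H'B. \tag{11} $$

From (10) and (11), the precoder structure is 

$$ B = \Lambda' A. \tag{12} $$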


Since the noise statistics and the statistics of the customer’s data 
are usually unavailable, we choose here not to carry out a usual optimiza- 
tion study on the choice of V using such statistics. In the sequel, V is 
chosen to be a set of orthonormal basis vectors. This enables the precoding signaling scheme (Scheme II) to meet the following requirements: 


(i) The performance is optimum in the same sense as the popular 
Scheme I described in Section I. 

(ii) The overall channel characteristics are controlled by the precoding 
matrix A and hence by resistive elements. (In Scheme I the overall 
channel characteristics are controlled by the transmitting and receiving 
filters.) 


These requirements are met with a simple receiver structure. The noisy signal at the input of the receiver is 

$$ X = S + N = \sum_{k=1}^{N} a_k V_k + N, \tag{13} $$

where N is the noise vector. A correlator at the receiver computes the decision statistics $X'V_1, X'V_2, \ldots, X'V_N$. Since $V_1, V_2, \ldots, V_N$ are orthonormal, we have 

$$ X'V_k = a_k + N'V_k. \tag{14} $$

Because of orthonormality the decision statistic $X'V_k$ depends only on $a_k$ and there is no intersymbol interference. A decision on the symbol $a_k$ can be made from the decision statistic $X'V_k$ by a simple, standard decision rule. 
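The precoder and correlator relations can be exercised on a toy case. The Python sketch below is only an illustration under stated assumptions: it uses a hypothetical block length N = 2 with time sample vectors (1, 1, 0) and (0, 1, 1), builds an orthonormal basis by Gram-Schmidt (one possible choice, not necessarily the one a designer would select), forms the precoder output as b_k = sum over j of lambda_jk a_j, and checks that the noise-free decision statistics X'V_j return the symbols.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

h = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]   # hypothetical time sample vectors
V = gram_schmidt(h)

# V_j = lam[j][0]*h_1 + lam[j][1]*h_2; for these h the coefficients can be
# read off the first and last samples of V_j.
lam = [[Vj[0], Vj[2]] for Vj in V]

a = [0.5, -0.5]                                                   # symbols
b = [sum(lam[j][k] * a[j] for j in range(2)) for k in range(2)]   # precoder
S = [sum(b[k] * h[k][m] for k in range(2)) for m in range(3)]     # line signal
stats = [dot(S, Vj) for Vj in V]                                  # correlator
```

In the absence of noise, `stats` reproduces `a`, which is the statement that each decision statistic depends on one symbol only.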

A basic difference between Schemes I and II is that in I the signals $f(t - kT_0)$ are time translates of each other, but in II the orthogonal signals $V_k$ are not necessarily time translates. A difference in operation between the two schemes is seen in the second requirement. In Scheme I the overall channel characteristics are controlled by the transmitting and receiving filters. But, in Scheme II, they are controlled by the precoding matrix. To illustrate this and also for use in Section IV, we derive the impulse responses of Scheme II. As shown in Fig. 3, the correlator can be implemented with a tapped delay line and N sets of attenuators. Only the jth set of attenuators is shown. The attenuation ratios $V_{j1}, \ldots, V_{jM}$ shown are the values of the elements of $V_j$, and the decision statistic $X'V_j$ is obtained by sampling the output of the jth summing circuit. For analytical purposes, the tapped delay line, the jth set of attenuators, and the jth summing circuit together are equivalent to a matched filter having impulse response $V_j(t_0 - t)$, where $t_0$ is the sampling instant and $V_j(t)$ is a signal whose time sample vector is $V_j$. Now define 

$$ h_{ij}(t) = \text{output of the } j\text{th summing circuit when } a_i = 1 \text{ is applied to the precoder.} \tag{15} $$

[Fig. 3. Diagram defining $h_{ij}(t)$: tapped delay line, jth set of attenuators, and jth summing circuit.] 


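Since $a_i$ is transmitted by the signal $V_i$ or $V_i(t)$, we have 

$$ h_{ij}(t) = \int_{-\infty}^{\infty} V_i(\tau)\, V_j(t_0 + \tau - t)\, d\tau. \tag{16} $$

From (8) and (9) 

$$ V_j = \sum_{k=1}^{N} \lambda_{jk} h_k. \tag{17} $$

From (16) and (17) 

$$ h_{ij}(t) = \sum_{k=1}^{N} \sum_{l=1}^{N} \lambda_{jk} \lambda_{il} \int_{-\infty}^{\infty} h(t_0 + \tau - t - kT)\, h(\tau - lT)\, d\tau. \tag{18} $$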
It is seen from (18) that, for a given transmission medium, $h_{ij}(t)$ is controlled by the elements $\lambda_{ij}$ of the precoding matrix. Since changing the precoding matrix requires only changing attenuation ratios in the precoder and the correlator, the overall channel characteristics are controlled by resistive elements. 


III. PRECODING FOR MULTIPLE-SPEED TRANSMISSION 


The transmission of a single block has been considered in Section II. Now consider the transmission of an infinite sequence of symbols. In Scheme II, a symbol sequence is divided into blocks with N symbols in each block. If the vectors $h_1, \ldots, h_N$ are M × 1 as assumed in Section II, the blocks can be transmitted sequentially at intervals of MT seconds without interference between each other and the data rate is 

$$ R = \frac{N}{M} R_{\max} \ \text{bauds}, \tag{19} $$

where $R_{\max}$ is the Nyquist rate. 

Theoretically there is no limit on the block length N; however, we shall restrict N to be a small number such as 3 so that the precoder and the correlator can be easily implemented. The parameter M must be restricted accordingly so that R [see (19)] can be a commonly used data rate such as 3/4 of the Nyquist rate. These requirements are satisfied by using the popular partial response channels.’** 

Table I of Ref. 3 illustrated five classes of partial response channels. From the table, it is clear that if h(t) is in Class 1, then a set of sampling instants can be chosen (sampling time error will be considered later) such that h(t - T), ..., h(t - NT) are simultaneously zero at all except N + 1 adjacent sampling points. This means that the vectors $h_1, \ldots, h_N$ are each (N + 1) × 1 so that 

$$ M = N + 1, \qquad R = \frac{N}{N + 1} R_{\max} \ \text{bauds}. \tag{20} $$


If h(t) is in Class 2, or 3, or 4, sampling instants can be chosen such 
that M = N + 2. The rule can be easily extended to other classes. 
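The rate rule quoted above is easy to tabulate. The following Python sketch is an illustration only; the function names are ours, and the rule it encodes (M = N + 1 for Class 1, M = N + 2 for Classes 2, 3 and 4) is simply the one stated in the text, with the rate expressed as a fraction of the Nyquist rate as in (19).

```python
from fractions import Fraction

def samples_per_block(n_symbols, pr_class):
    """Length M of the time sample vectors for block length N."""
    if pr_class == 1:
        return n_symbols + 1
    if pr_class in (2, 3, 4):
        return n_symbols + 2
    raise ValueError("class not covered by the rule quoted in the text")

def relative_rate(n_symbols, pr_class):
    """Data rate R / R_max = N / M, following (19)."""
    return Fraction(n_symbols, samples_per_block(n_symbols, pr_class))

class1_rates = {n: relative_rate(n, 1) for n in (1, 2, 3, 7)}
```

For example, N = 3 on a Class 1 channel gives 3/4 of the Nyquist rate, the case taken up in Section IV.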

Consider now multiple-speed operation. As will be shown it is de- 
sirable to change the overall channel characteristics when changing the 
data rate. To make these changes, it is necessary to change the data 
format; however, it is desired that the system not be altered signifi- 
cantly otherwise (such as changing the filters, the equalization, the 
signaling interval, the receiver sampling time, etc.). 

The scheme developed allows the data rate and overall channel 
characteristics to be changed simultaneously by changing only the 
data format and some resistive elements. When the system operates as above, the data rate is $(N/M) R_{\max}$ bauds and the sequence of symbols 

$$ a_1 \; a_2 \; \cdots \; a_N \; a_{N+1} \; \cdots $$

is transmitted. If the particular channel is noisy, one may wish to reduce the baud rate so that signal energy per baud can be increased to combat noise (an adaptive technique). The data rate can be changed to 

$$ R = \frac{r}{M} R_{\max} \ \text{bauds}, \tag{21} $$

where r can be any integer from 1 to N, by inserting N - r zero digits into each block as follows 

$$ a_1 \cdots a_r \; 0 \cdots 0 \; a_{r+1} \cdots a_{2r} \; 0 \cdots 0 \; a_{2r+1} \cdots $$

and transmitting this sequence instead of the original symbol sequence. The r information symbols in each block are transmitted to the first r summing circuits of the correlator at the receiver, while the N - r zero digits in each block are transmitted to the other summing circuits. 
For convenience, let us refer to the transmission path from the precoder 
to the jth summing circuit as the jth subchannel. Since there is no 
information transmission through the last N — r subchannels, it is 
no longer necessary to consider their performances. The precoding 
matrix A can be changed to improve the performance of the first r 
subchannels (such as reducing the effect of timing error). This can be 
done by changing the resistive elements in the precoder and correlator. 
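The change of data format can be sketched in a few lines. The Python fragment below is illustrative only (the function names are ours): it pads every group of r information symbols with N - r zeros and reports the resulting rate r/M as a fraction of the Nyquist rate, following (21).

```python
from fractions import Fraction

def insert_zeros(symbols, n_block, r):
    """Pad each group of r information symbols with n_block - r zeros."""
    if not 1 <= r <= n_block:
        raise ValueError("r must lie between 1 and the block length")
    out = []
    for start in range(0, len(symbols), r):
        group = list(symbols[start:start + r])
        group += [0] * (n_block - len(group))   # zero digits fill the block
        out.extend(group)
    return out

def reduced_rate(r, m_samples):
    """Data rate R / R_max = r / M, following (21)."""
    return Fraction(r, m_samples)

formatted = insert_zeros([1, -1, 1, 1, -1, -1], n_block=3, r=2)
```

With N = 3, M = 4 and r = 2 the formatted sequence carries two symbols per block of three, and the rate drops from 3/4 to 1/2 of the Nyquist rate.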

To summarize, the multiple-speed transmission scheme has the 
following properties: 


(i) Changing data rate and overall channel characteristics requires 
only changing the data format and some resistive elements. There is 
no change to the filters, the equalization, the signaling interval T, 
or the receiver sampling time. 

(ii) With correct timing and the use of orthonormal signals, signal-to-noise 
ratio is maximized at each data rate for band-limited white 
noise under the constraints of fixed line signal power and no inter- 
symbol interference. 

(iii) By using partial response channels, commonly used data rates 
are easily obtained, using a physically realizable precoder and correlator. 


The discussions so far are general. To show how the method can be 
applied, and, more important, to demonstrate the advantage of changing 
the overall channel characteristics when changing the data rate, we 
consider in detail a two-speed transmission scheme in Section IV. 


IV. TWO-SPEED TRANSMISSION AND EYE PATTERNS 


Consider the following problem: The transmission medium is equalized for transmission at half the Nyquist rate and 

$$ H(j\omega) = \text{square root of full-cosine rolloff characteristic} = \begin{cases} k \cos \dfrac{\omega}{4 f_1}, & |\omega| \le 2\pi f_1, \\ 0, & \text{otherwise}, \end{cases} \tag{22} $$

where k is a gain factor and $f_1$ is the bandwidth. It is recognized that, if Scheme I is used, the system is simply the popular full-cosine rolloff system transmitting at half the Nyquist rate. 

The channel can be utilized more efficiently if the transmission rate 
can be changed according to the noise level. To compromise between 
efficiency and equipment complexity we choose to consider here two-speed 
transmission and two common data rates, 1/2 and 3/4 of the Nyquist 
rate. 

We consider in detail how Scheme II can be used for this purpose. Note that $H(j\omega)$ in (22) is the Class 1 partial response system function. Therefore, from (20) and (21) 

$$ R = \frac{r}{N + 1} R_{\max} \ \text{bauds}, \tag{23} $$

where r can be any integer from 1 to N. To obtain $\tfrac{1}{2} R_{\max}$ and $\tfrac{3}{4} R_{\max}$ from (23), N can be 3, 7, etc. We choose N = 3 so that the precoder and correlator can be easily implemented. 

To obtain the higher data rate, the sequence of information symbols is divided into blocks with three digits in each block, where the nth block contains the symbols $a_{3n+1}$, $a_{3n+2}$, and $a_{3n+3}$. The blocks are applied to the precoder sequentially at 4T intervals. The precoder converts the symbols $a_{3n+1}$, $a_{3n+2}$, and $a_{3n+3}$ in the nth block into numbers $b_{3n+1}$, $b_{3n+2}$, and $b_{3n+3}$ and transmits $b_{3n+i}$ at $t = (4n + i)T$. 

Consider the block containing $a_1$, $a_2$, and $a_3$. The precoder converts $a_1$, $a_2$, and $a_3$ into $b_1$, $b_2$, and $b_3$, and transmits $b_1$, $b_2$, and $b_3$ sequentially at t = T, 2T, 3T. This produces, as discussed in the previous section, a signal at the receiver input as 


$$ X = b_1 h_1 + b_2 h_2 + b_3 h_3 + N, \tag{24} $$

where the time sample vectors $h_1$, $h_2$, and $h_3$ can be written as (omitting a gain factor and the common zero samples) 

$$ h_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad h_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \quad h_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}. \tag{25} $$

Equation (25) shows that if sampling time is correct (timing error will be considered later), $h_1$, $h_2$, and $h_3$ are limited to a 4T time interval. Since each block is transmitted in a 4T time interval, there is no interference between adjacent blocks. 
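The sample pattern in (25) can be reproduced from the impulse response of a cosine-shaped, bandlimited transfer function. The Python sketch below assumes the standard Class 1 (duobinary) pulse, sinc(t/T - 1/2) + sinc(t/T + 1/2) up to a gain factor, and sampling instants placed midway between multiples of T; both the pulse normalization and the choice of sampling grid are our assumptions for illustration.

```python
import math

def sinc(x):
    """sin(pi x) / (pi x), with the removable singularity filled in."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def class1_pulse(u):
    """Class 1 pulse versus u = t/T, centred between its two unit samples."""
    return sinc(u - 0.5) + sinc(u + 0.5)

def sample_vector(k, n_samples=4):
    """Samples of h(t - kT) at t = (m - 1/2)T, m = 1..n_samples."""
    return [round(class1_pulse((m - 0.5) - k), 12) + 0.0
            for m in range(1, n_samples + 1)]

h_vectors = [sample_vector(k) for k in (1, 2, 3)]
```

Each shifted pulse contributes two adjacent unit samples and zeros elsewhere, so three pulses occupy four samples, which is the M = N + 1 rule for Class 1.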

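The vectors $h_1$, $h_2$, and $h_3$ generate a three-dimensional real Euclidean vector space $\mathcal{E}_3$. Let 

$$ V_1 = \begin{bmatrix} V_{11} \\ V_{12} \\ V_{13} \\ V_{14} \end{bmatrix}, \quad V_2 = \begin{bmatrix} V_{21} \\ V_{22} \\ V_{23} \\ V_{24} \end{bmatrix}, \quad V_3 = \begin{bmatrix} V_{31} \\ V_{32} \\ V_{33} \\ V_{34} \end{bmatrix} \tag{26} $$

be a set of orthonormal basis vectors for $\mathcal{E}_3$ and let $a_1$, $a_2$, and $a_3$ be transmitted as coordinates of $V_1$, $V_2$, and $V_3$, respectively. Then the signal X at the input of the receiver must also be 

$$ X = a_1 V_1 + a_2 V_2 + a_3 V_3 + N. \tag{27} $$

The precoder structure then is 

$$ \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} \lambda_{11} & \lambda_{21} & \lambda_{31} \\ \lambda_{12} & \lambda_{22} & \lambda_{32} \\ \lambda_{13} & \lambda_{23} & \lambda_{33} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, $$

where the $\lambda_{ij}$'s can be easily determined from (24), (25), (26), and (27). This precoder structure can be easily realized (Fig. 4). 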

The correlator at the receiver which computes the decision statistics $X'V_1$, $X'V_2$, and $X'V_3$ can also be easily realized (Fig. 3, j = 1, 2, 3; M = 4). 

It is clear from Figs. 3 and 4 that the precoding matrix can be changed 
by simply changing the resistive elements (the attenuators) in the 
precoder and the correlator. 

The transmission rate is $\tfrac{3}{4} R_{\max}$ when the system operates as above. To change the transmission rate to $\tfrac{1}{2} R_{\max}$, zero digits are inserted into the original data sequence as follows 

$$ a_1 \; a_2 \; 0 \; a_3 \; a_4 \; 0 \; a_5 \; a_6 \; 0 \cdots, \tag{28} $$

and this new sequence is transmitted instead of the original sequence. 
Making use of the reduced baud rate to improve system performance, the 
overall channel characteristic is adjusted simultaneously by changing 
the precoding matrix. This is the subject of the following section. 


4.1 Timing Error and Eye Opening 


So far we have not specified which set of orthonormal basis vectors should be used. This is because with perfect timing the system meets the signal-to-noise ratio criterion in Section III regardless of which set of orthonormal basis vectors is chosen. 

[Fig. 4. Precoder for two-speed transmission, where $\lambda_{ij}$, i, j = 1, 2, 3, are attenuators.] 

However, in practice, it is impossible to achieve zero sampling time error. In general, the receiver will sample the summing circuit outputs at $t = t_0 + \delta$ instead of the correct time $t_0$, where $\delta$ is a random timing error. Then the system's performance depends on the choice of $V_1$, $V_2$, and $V_3$, i.e., depends on the choice of $\Lambda$. To determine which $\Lambda$ should be used, it is necessary to specify the type of transmission and choose a performance criterion accordingly. 

In the sequel, we consider digital data transmission. Eye opening 
is adopted as the criterion since it is a widely accepted, practical one* 
(although considering eye openings in the presence of timing error 
leads to a difficult nonlinear mathematical problem). 

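Let $r_j(t)$ be the output of the jth summing circuit when an infinite sequence of digits is transmitted at the higher data rate $\tfrac{3}{4} R_{\max}$. Then 

$$ r_j(t) = \sum_{n=-\infty}^{\infty} a_{3n+j}\, h_{jj}(t - 4nT) + \sum_{i \ne j} \sum_{n=-\infty}^{\infty} a_{3n+i}\, h_{ij}(t - 4nT), \tag{29} $$

where $h_{ij}(t)$, as defined in (15), is the output of the jth summing circuit when $a_i = 1$ is transmitted alone. From (18) and (22) it can be shown that 

$$ \begin{aligned} h_{ij}(t) = {} & [\lambda_{i1}\lambda_{j1} + \lambda_{i2}\lambda_{j2} + \lambda_{i3}\lambda_{j3}]\, I(t) + [\lambda_{i2}\lambda_{j1} + \lambda_{i3}\lambda_{j2}]\, I(t - T) \\ & + [\lambda_{i1}\lambda_{j2} + \lambda_{i2}\lambda_{j3}]\, I(t + T) \\ & + \lambda_{i3}\lambda_{j1}\, I(t - 2T) + \lambda_{i1}\lambda_{j3}\, I(t + 2T), \end{aligned} \tag{30} $$

where 

$$ I(t) = \frac{1}{\pi f_1}\, \frac{\sin 2\pi f_1 (t - t_0)}{(t - t_0)\left[1 - 4 f_1^2 (t - t_0)^2\right]}. \tag{31} $$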


To evaluate eye opening of $r_j(t)$ at $t_0 + \delta$, we assume that the information digits $\{a_i\}$ are binary and that each can be 1/2 or -1/2 (so that full eye opening = 1). Then 

$$ \begin{aligned} E_j(\delta) &= \text{eye opening of } r_j(t) \text{ at } t_0 + \delta \\ &= |h_{jj}(t_0 + \delta)| - \sum_{n \ne 0} |h_{jj}(t_0 + \delta - 4nT)| - \sum_{i \ne j} \sum_{n=-\infty}^{\infty} |h_{ij}(t_0 + \delta - 4nT)|, \qquad i, j = 1, 2, 3. \end{aligned} \tag{32} $$

Similarly, let $r_j'(t)$ be the output of the jth summing circuit when an infinite sequence of digits is transmitted at the lower data rate $\tfrac{1}{2} R_{\max}$. Since zeros are inserted and no information digit is received at the third summing circuit, we need to consider only the eye openings $E_1'(\delta)$ and $E_2'(\delta)$ of $r_1'(t)$ and $r_2'(t)$, respectively. 


4.2 Selection of Precoding Matrix 


It is seen that at the higher data rate, we must consider simultaneously $E_1(\delta)$, $E_2(\delta)$, and $E_3(\delta)$, while at the lower data rate we need only to consider $E_1'(\delta)$ and $E_2'(\delta)$. This suggests that a different precoding matrix should be selected for each data rate. 

The steps in selection of the precoding matrix are lengthy and are 
outlined in the Appendix. The results are summarized here. 

The precoding matrix selected for the higher data rate is 

$$ \begin{aligned} \lambda_{11} &= 0.21, & \lambda_{12} &= 0.62, & \lambda_{13} &= -0.5, \\ \lambda_{21} &= -0.68, & \lambda_{22} &= 0.48, & \lambda_{23} &= -0.68, \\ \lambda_{31} &= -0.5, & \lambda_{32} &= 0.62, & \lambda_{33} &= 0.21. \end{aligned} \tag{33} $$

Eye openings obtained with this precoding matrix are given in Table I for $\delta$ = 0, ±0.1T, ±0.2T, ±0.3T (it is a reasonable expectation that the timing error $\delta$ will amount to no more than ±0.2T). Also given in Table I is the eye opening $E(\delta)$ of the popular "raised cosine" rolloff system which transmits at the same data rate $\tfrac{3}{4} R_{\max}$ (i.e., which utilizes a 33.3 percent rolloff band). A glance shows that the eye openings $E_1(\delta)$, $E_2(\delta)$, and $E_3(\delta)$ are equal to or larger than the eye opening $E(\delta)$ of the conventional system. 

TABLE I. COMPARISON OF EYE OPENINGS 

Timing Error δ    E_1(δ)    E_2(δ)    E_3(δ)    E(δ) 
-0.3T             0.312     0.418     0.351     0.312 
-0.2T             0.559     0.625     0.575     0.551 
-0.1T             0.790     0.821     0.793     0.783 
 0                1.000     1.000     1.000     1.000 
 0.1T             0.793     0.821     0.790     0.783 
 0.2T             0.575     0.625     0.559     0.551 
 0.3T             0.351     0.418     0.312     0.312 
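The orthonormality of the basis implied by (33) can be checked directly. The Python sketch below assumes that (33) reads lambda_11 = 0.21, lambda_12 = 0.62, lambda_13 = -0.5, lambda_21 = -0.68, lambda_22 = 0.48, lambda_23 = -0.68, lambda_31 = -0.5, lambda_32 = 0.62, lambda_33 = 0.21, and that V_j is the sum over k of lambda_jk h_k with the time sample vectors of (25). Since the entries are quoted to two decimals, the Gram matrix is expected to match the identity only to about that accuracy.

```python
LAM = [[0.21, 0.62, -0.50],
       [-0.68, 0.48, -0.68],
       [-0.50, 0.62, 0.21]]          # assumed reading of (33)
H = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]                   # time sample vectors of (25)

def basis_vectors(lam, h):
    """V_j = sum_k lam[j][k] * h_k for every row j of the precoding matrix."""
    return [[sum(lam[j][k] * h[k][m] for k in range(len(h)))
             for m in range(len(h[0]))]
            for j in range(len(lam))]

def gram(vectors):
    """Matrix of inner products V_i' V_j."""
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors]
            for u in vectors]

V = basis_vectors(LAM, H)
G = gram(V)
```

The diagonal entries of `G` come out close to one and the off-diagonal entries close to zero, which is what the correlator needs for the decision statistics in (14).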

The precoding matrix selected for the lower data rate is 

$$ \begin{aligned} \lambda_{11} &= \frac{1}{\sqrt{2}}, & \lambda_{12} &= 0, & \lambda_{13} &= 0, \\ \lambda_{21} &= 0, & \lambda_{22} &= 0, & \lambda_{23} &= \frac{1}{\sqrt{2}}, \end{aligned} $$

where $\lambda_{31}$, $\lambda_{32}$, and $\lambda_{33}$ can be arbitrary since no information digit is transmitted through the third subchannel. With this precoding matrix, the system is identical with the popular "full cosine" rolloff system at the lower data rate, and the eye openings $E_1'(\delta)$ and $E_2'(\delta)$ are both 1.00, 0.955, 0.896, and 0.823, respectively, for $\delta$ equal to 0, ±0.1T, ±0.2T, and ±0.3T. These eye openings are much larger than $E_1(\delta)$ and $E_2(\delta)$ in Table I. This clearly demonstrates the advantage of changing the precoding matrix when changing the transmission rate. 
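The eye openings can be recomputed from (30) to (32). The Python sketch below rests on several assumptions that should be read as ours: I(t) in (31) is taken as twice the full-cosine rolloff pulse, so that I(t_0) = 2 and I(t_0 ± T) = 1; time is measured in units of T; the precoding matrices are the two quoted in this section, with the arbitrary third row of the lower-rate matrix set to zero; and the infinite sums are truncated at |n| = 400.

```python
import math

def pulse(u):
    """I(t) of (31) with u = (t - t0)/T (assumed normalization)."""
    if abs(u) < 1e-12:
        return 2.0
    if abs(abs(u) - 1.0) < 1e-9:
        return 1.0
    return 2.0 * math.sin(math.pi * u) / (math.pi * u * (1.0 - u * u))

def h_ij(lam, i, j, u):
    """Eq. (30): output of summing circuit j when a_i = 1 is sent alone."""
    return sum(lam[j][k] * lam[i][l] * pulse(u + k - l)
               for k in range(3) for l in range(3))

def eye_opening(lam, j, delta, active, n_max=400):
    """Eq. (32) for binary symbols of value +1/2 or -1/2.

    `active` lists the subchannels that carry information digits.
    """
    opening = abs(h_ij(lam, j, j, delta))
    for n in range(-n_max, n_max + 1):
        for i in active:
            if i == j and n == 0:
                continue                      # the wanted sample itself
            opening -= abs(h_ij(lam, i, j, delta - 4 * n))
    return opening

HIGH = [[0.21, 0.62, -0.50], [-0.68, 0.48, -0.68], [-0.50, 0.62, 0.21]]
s = 1.0 / math.sqrt(2.0)
LOW = [[s, 0.0, 0.0], [0.0, 0.0, s], [0.0, 0.0, 0.0]]

low_01 = eye_opening(LOW, 0, 0.1, active=[0, 1])
low_02 = eye_opening(LOW, 0, 0.2, active=[0, 1])
high_e2_01 = eye_opening(HIGH, 1, 0.1, active=[0, 1, 2])
```

Under these assumptions the lower-rate values agree with the figures 0.955 and 0.896 quoted above, and the higher-rate value for the second subchannel lands near the Table I entry; small differences are expected because (33) is quoted to two decimals only.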


V. CONCLUSIONS 


A precoding scheme is presented for multiple-speed digital or analog 
data transmission. The scheme has the following properties 


(i) Changing data rate and overall channel characteristics requires 
only changing the data format and some resistive elements. There is 
no change to the filters, the equalization, the transmitter signaling 
interval, or the receiver sampling time. 

(ii) With correct timing and the use of orthonormal signals, the signal-to-noise 
ratio is maximized at each data rate for band-limited white 
noise under the constraints of fixed line signal power and no intersymbol 
interference. 
(iii) By using partial response channels, a number of commonly used 
data rates are easily obtained using a physically realizable precoder and 
correlator. 


Timing error is considered in a two-speed transmission scheme. Eye 
openings are used as the criterion in selecting the precoding matrix. 
Eye openings obtained are equal to or larger than those of two conventional 
schemes transmitting at the same data rates. The study clearly 
demonstrates the advantage of changing the overall channel charac- 
teristics when changing the data rate. 


APPENDIX 


Selection of Precoding Matrix 


As can be seen from (32), (30), and (31), the problem of finding a precoding matrix for maximizing the eye openings in some joint sense over a certain range of the random variable $\delta$ is nonlinear and mathematically intractable. In the following, we reduce the dimension and range of the precoding-matrix space $\mathcal{S} = \{\Lambda\}$ to a minimum by using constraints and properties of $\mathcal{S}$, then derive a guide for searching the reduced space. Eye openings are obtained equal to or larger than those of two conventional schemes transmitting at the same data rates. 

Consider the higher data rate. The eye openings $E_1(\delta)$, $E_2(\delta)$, and $E_3(\delta)$ are determined by the nine parameters $\lambda_{ij}$, i, j = 1, 2, 3. We have from orthogonality of $V_1$, $V_2$, and $V_3$ 

$$ h_{ij}(t_0) = 0, \qquad i, j = 1, 2, 3; \ i \ne j. \tag{34} $$

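Define for j = 1, 2, 3 

$$ W_j = \frac{\lambda_{j1}}{\lambda_{j2}} + \frac{1}{2}, \qquad C_j = \frac{\lambda_{j3}}{\lambda_{j2}} + \frac{1}{2}. \tag{35} $$

It can be shown from (30), (31), and (35) that (34) is equivalent to the constraints 

$$ C_i C_j = -W_i W_j - \frac{1}{2}, \qquad i, j = 1, 2, 3; \ i \ne j. \tag{36} $$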
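The step from the orthogonality condition (34) to the constraint (36) can be sketched as follows. The sketch assumes that I(t) in (31) takes the values $I(t_0) = 2$, $I(t_0 \pm T) = 1$, $I(t_0 \pm 2T) = 0$, and that (35) reads $W_j = \lambda_{j1}/\lambda_{j2} + 1/2$, $C_j = \lambda_{j3}/\lambda_{j2} + 1/2$ with $\lambda_{j2} \ne 0$. These readings are inferred from the numerical values quoted in the paper and should be treated as assumptions.

```latex
\begin{align*}
h_{ij}(t_0) &= 2\,(\lambda_{i1}\lambda_{j1} + \lambda_{i2}\lambda_{j2} + \lambda_{i3}\lambda_{j3})
  + (\lambda_{i2}\lambda_{j1} + \lambda_{i3}\lambda_{j2})
  + (\lambda_{i1}\lambda_{j2} + \lambda_{i2}\lambda_{j3}), \\[4pt]
\frac{h_{ij}(t_0)}{2\lambda_{i2}\lambda_{j2}}
  &= w_i w_j + c_i c_j + 1 + \tfrac{1}{2}\,(w_i + w_j + c_i + c_j),
  \qquad w_j = \frac{\lambda_{j1}}{\lambda_{j2}},\; c_j = \frac{\lambda_{j3}}{\lambda_{j2}}, \\[4pt]
  &= \bigl(w_i + \tfrac{1}{2}\bigr)\bigl(w_j + \tfrac{1}{2}\bigr)
   + \bigl(c_i + \tfrac{1}{2}\bigr)\bigl(c_j + \tfrac{1}{2}\bigr) + \tfrac{1}{2} \\[4pt]
  &= W_i W_j + C_i C_j + \tfrac{1}{2}.
\end{align*}
```

Setting the left-hand side to zero for $i \ne j$ gives $C_i C_j = -W_i W_j - 1/2$, which is the form of (36).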


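Equation (36) is satisfied if and only if one of the following conditions holds 

$$ \begin{aligned} &\text{(i)} & W_1 W_2 &\le -\tfrac{1}{2}, & W_2 W_3 &\le -\tfrac{1}{2}, & W_3 W_1 &\le -\tfrac{1}{2} & (37) \\ &\text{(ii)} & W_1 W_2 &\le -\tfrac{1}{2}, & W_2 W_3 &\ge -\tfrac{1}{2}, & W_3 W_1 &\ge -\tfrac{1}{2} & (38) \\ &\text{(iii)} & W_1 W_2 &\ge -\tfrac{1}{2}, & W_2 W_3 &\le -\tfrac{1}{2}, & W_3 W_1 &\ge -\tfrac{1}{2} & (39) \\ &\text{(iv)} & W_1 W_2 &\ge -\tfrac{1}{2}, & W_2 W_3 &\ge -\tfrac{1}{2}, & W_3 W_1 &\le -\tfrac{1}{2}. & (40) \end{aligned} $$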
Each of the four conditions specifies a subspace of $\mathcal{S}$. Equation (37) corresponds to a null space because its requirements are conflicting: the product of the three left-hand sides is $(W_1 W_2 W_3)^2 \ge 0$, whereas the product of three numbers each no larger than -1/2 is negative. Equations (39) and (40) can be obtained from (38) by rotating the indexes of W; hence, for every point P in the subspace of (39) or (40), there is a point Q in the subspace of (38) such that P and Q produce eye openings differing only in indexes (for instance, P produces $E_1(\delta) = \alpha(\delta)$, $E_2(\delta) = \beta(\delta)$, and $E_3(\delta) = \gamma(\delta)$; Q produces $E_1(\delta) = \gamma(\delta)$, $E_2(\delta) = \alpha(\delta)$, and $E_3(\delta) = \beta(\delta)$). Since they are the same set of eye openings, we need only to cover the subspace of (38) in searching for $\Lambda$. 

The subspace of (38) can be further narrowed. It can be shown that (38) holds if and only if 

$$ W_1 W_2 \le -\tfrac{1}{2}, \qquad W_2 W_3 \ge 0, \qquad -\tfrac{1}{2} \le W_3 W_1 \le 0 \tag{41} $$

or 

$$ W_1 W_2 \le -\tfrac{1}{2}, \qquad -\tfrac{1}{2} \le W_2 W_3 \le 0, \qquad W_3 W_1 \ge 0. \tag{42} $$

Equation (42) can be obtained from (41) by exchanging $W_1$ and $W_2$. Thus, for the reason just cited, we need to search only the subspace of (41) instead of that of (38). 

To further reduce $\mathcal{S}$, we divide the subspace of (41) into two subspaces 

$$ \text{(i)} \quad W_1 W_2 \le -\tfrac{1}{2}, \qquad W_2 W_3 \ge 0, \qquad -\tfrac{1}{2} \le W_3 W_1 \le -\tfrac{1}{4} \tag{43} $$

$$ \text{(ii)} \quad W_1 W_2 \le -\tfrac{1}{2}, \qquad W_2 W_3 \ge 0, \qquad -\tfrac{1}{4} \le W_3 W_1 \le 0. \tag{44} $$

From (36), (44) can be written as 

$$ C_1 C_2 \ge 0, \qquad C_2 C_3 \le -\tfrac{1}{2}, \qquad -\tfrac{1}{2} \le C_3 C_1 \le -\tfrac{1}{4}. \tag{44a} $$

It can be shown from (32), (30), and (35) that simultaneously exchanging $W_1$ and $C_1$, $W_2$ and $C_2$, and $W_3$ and $C_3$ does not change the eye openings. From this it can be shown that for every point P in the subspace of (43), there is a point Q in the subspace of (44a) such that P and Q have eye openings differing only in indexes. Since (44a) is equivalent to (44), this implies that only the subspace of (44) needs to be searched instead of that of (41). 

The space $\mathcal{S}$ to be searched has been reduced to only that of (44). The $W_2$-$W_3$ plane is reduced to a narrow strip for all $W_1 \ne 0$. For instance, for $W_1 = -1$, $W_3$ is bounded between 0 and 1/4, and $W_2$ and $W_3$ are bounded in the very narrow strip shown in Fig. 5. 

Each point $(W_1, W_2, W_3)$ in the subspace of (44) determines a precoding matrix through (36), (35), and the orthonormality condition $h_{jj}(t_0) = 1$. 

[Fig. 5. Region of $W_2$ and $W_3$ (shaded) when $W_1 = -1$.] 


Usually it is desirable that the three eye openings $E_1(\delta)$, $E_2(\delta)$, 
and $E_3(\delta)$ be approximately equal. It can be shown from (32), (30), 
and (31) that $E_1(\delta)$ and $E_3(\delta)$ are approximately equal if 


$$W_1 = C_3, \quad W_2 = C_2, \quad \text{and} \quad W_3 = C_1. \tag{45}$$

Equation (45) defines the following region in the subspace of (44) 
1 gee ee aye 
\Wil>5, We=-gprtq, Ws--gp 48) 


$E_2(\delta)$ is larger than $E_1(\delta)$ and $E_3(\delta)$ at one extreme of the range $|W_1| > \tfrac{1}{2}$, 
and is smaller at the other extreme. Therefore, in the region of (46), 
there are points at which $E_1(\delta)$, $E_2(\delta)$, and $E_3(\delta)$ are approximately 
equal. A simple search of this one-dimensional region gives one such 
point: 

$$W_1 = 0.84, \quad W_2 = -0.92, \quad W_3 = -0.3.$$


This point gives the precoding matrix in (33). Table I in Section IV 
shows that, with this precoding matrix at the higher data rate, 
the system has eye openings equal to or larger than the eye openings 
of a "raised cosine" rolloff system transmitting at the same data rate. 

After the precoding matrix in (33) was obtained from the region of 
(46), the rest of the subspace of (44) was searched. About 5000 points 
were covered. No point was found whose eye openings $E_1(\delta)$, $E_2(\delta)$, 
and $E_3(\delta)$ were simultaneously larger than those in Table I. 
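
The exhaustive check described above amounts to a dominance test: a candidate point is of interest only if all three of its eye openings exceed the reference values at once. The following Python sketch shows that logic only. The function `toy_eye_openings` is a hypothetical stand-in, not the eye-opening formulas of this paper, and the grid and reference point are illustrative assumptions.

```python
from itertools import product

def dominates(candidate, reference):
    """True if every eye opening in candidate exceeds the matching reference value."""
    return all(c > r for c, r in zip(candidate, reference))

def search_for_better_points(points, eye_openings, reference):
    """Return the points whose three eye openings all exceed the reference."""
    return [p for p in points if dominates(eye_openings(p), reference)]

def toy_eye_openings(w):
    """Hypothetical stand-in for the true eye-opening computation."""
    w1, w2, w3 = w
    return (1.0 - abs(w1 - 0.8), 1.0 - abs(w2 + 0.9), 1.0 - abs(w3 + 0.3))

# Coarse illustrative grid over (W1, W2, W3); a real search would be
# restricted to the reduced subspace derived in the text.
grid = [i / 10 for i in range(-10, 11)]
points = list(product(grid, repeat=3))
reference = toy_eye_openings((0.8, -0.9, -0.3))
better = search_for_better_points(points, toy_eye_openings, reference)
```

An empty `better` list corresponds to the statement that no searched point improves all three eye openings simultaneously.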

A similar study for the lower data rate produced the result in Sec- 
tion IV. 
Axis-Crossing Intervals of Sine Wave Plus Noise 
By A. J. RAINAL 


I. INTRODUCTION 


Let $I(t, a)$ denote the stationary random process consisting of a 
sinusoidal signal of amplitude $\sqrt{2a}$ and angular frequency $q$ plus 
Gaussian noise, $I_N(t)$, of zero mean and unit variance. Thus, 

$$I(t, a) = \sqrt{2a}\,\cos(qt + \theta) + I_N(t). \tag{1}$$


$\theta$ denotes a random phase angle which is distributed uniformly in the 
interval $(-\pi, \pi)$. "a" denotes the signal-to-noise power ratio. For the 
case a = 0, Rice presented some theoretical results which are very useful 
for studying statistical properties of the axis-crossing intervals and the 
axis-crossing points of $I(t, 0)$ at an arbitrary level $I$. The axis-crossing 
intervals and the axis-crossing points of $I(t, a)$ are defined in Fig. 1. 
In recent work Cobb presented some theoretical results concerning the 
zero-crossing intervals, the axis-crossing intervals defined by the level 
$I = 0$, of $I(t, a)$. Some experimental and theoretical results concerning 
the zero-crossing intervals of $I(t, a)$ were reported by Rainal. For the 
case when the power spectral density of $I_N(t)$ is narrow-band and 
symmetrical about the sine wave frequency, Blachman presented some 


Fig. 1 - The level $I$ defines the axis-crossing points and the axis-crossing 
intervals of $I(t, a) = \sqrt{2a}\,\cos(qt + \theta) + I_N(t)$. 
theoretical results concerning the zero-crossing points, the axis-crossing 
points defined by the level $I = 0$, of $I(t, a)$. 
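
To make the definitions concrete, the following Python sketch generates samples of a sine wave plus noise and locates the upward and downward crossings of a level. It assumes discrete-time sampling with independent Gaussian noise samples, which is a simplification of the general noise spectrum treated here; the function names and parameter values are illustrative only.

```python
import math
import random

def simulate(a, q, n, dt, rng):
    """Samples of sqrt(2a)*cos(q*t + theta) + noise at t = 0, dt, 2*dt, ..."""
    theta = rng.uniform(-math.pi, math.pi)  # random phase, uniform on (-pi, pi)
    amp = math.sqrt(2.0 * a)                # sinusoid power is a; noise power is 1
    return [amp * math.cos(q * k * dt + theta) + rng.gauss(0.0, 1.0)
            for k in range(n)]

def axis_crossings(samples, level, dt):
    """Return (upward_times, downward_times), using linear interpolation."""
    up, down = [], []
    for k in range(len(samples) - 1):
        y0 = samples[k] - level
        y1 = samples[k + 1] - level
        if y0 < 0.0 <= y1:            # passes upward through the level
            up.append((k + y0 / (y0 - y1)) * dt)
        elif y0 >= 0.0 > y1:          # passes downward through the level
            down.append((k + y0 / (y0 - y1)) * dt)
    return up, down

samples = simulate(a=1.0, q=2.0, n=2000, dt=0.01, rng=random.Random(1))
up_times, down_times = axis_crossings(samples, level=0.5, dt=0.01)
```

The differences between successive entries of the merged crossing lists are the axis-crossing intervals at the chosen level.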

The purpose of this brief is to present some theoretical results which 
are useful for studying statistical properties of the axis-crossing intervals 
and the axis-crossing points of $I(t, a)$ at an arbitrary level $I$. These 
results stem from a straightforward extension of Rice's analysis. 


II. THEORETICAL RESULTS 


Using a notation consistent with Refs. 5 and 6 we define the following 
probability functions at an arbitrary level $I$ and arbitrary signal-to-noise 
power ratio "a": 

(i) $Q_2^+(\tau, I, a)\,d\tau$, the conditional probability that a downward 
axis-crossing occurs between $t + \tau$ and $t + \tau + d\tau$ given an upward 
axis-crossing at $t$. 

(ii) $Q_2^-(\tau, I, a)\,d\tau$, the conditional probability that an upward 
axis-crossing occurs between $t + \tau$ and $t + \tau + d\tau$ given a downward 
axis-crossing at $t$. 

(iii) $[U_2(\tau, I, a) - Q_2(\tau, I, a)]\,d\tau$, the conditional probability that 
an upward axis-crossing occurs between $t + \tau$ and $t + \tau + d\tau$ given 
an upward axis-crossing at $t$. 

This latter conditional probability is also equal to the conditional 
probability that a downward axis-crossing occurs between $t + \tau$ and 
$t + \tau + d\tau$ given a downward axis-crossing at $t$. 
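
These functions count any crossing of the stated type in the interval, not only the first one after $t$. The following Python sketch estimates such a conditional crossing rate from lists of measured crossing times. The function name, the binning scheme, and the `record_end` guard against edge effects are assumptions made for illustration.

```python
from bisect import bisect_right

def conditional_crossing_rate(start_times, end_times, max_lag, n_bins, record_end):
    """Estimate the rate of 'end' crossings at lag tau after a 'start' crossing.

    start_times, end_times: sorted crossing times (for example, upward and
    downward crossings for (i), or upward and upward for (iii)).
    Returns one value per lag bin, in crossings per unit time.
    """
    width = max_lag / n_bins
    counts = [0] * n_bins
    n_start = 0
    for t0 in start_times:
        if t0 + max_lag > record_end:
            continue                      # skip starts whose future is truncated
        n_start += 1
        j = bisect_right(end_times, t0)   # first end crossing strictly after t0
        while j < len(end_times) and end_times[j] - t0 < max_lag:
            b = min(int((end_times[j] - t0) / width), n_bins - 1)
            counts[b] += 1                # every later crossing counts
            j += 1
    if n_start == 0:
        return [0.0] * n_bins
    return [c / (n_start * width) for c in counts]

# Deterministic check: a square wave with period 2 crossing upward at even
# times and downward at odd times.
ups = [0.0, 2.0, 4.0, 6.0, 8.0]
downs = [1.0, 3.0, 5.0, 7.0, 9.0]
rate = conditional_crossing_rate(ups, downs, max_lag=4.0, n_bins=4, record_end=10.0)
```

Passing the same list as both arguments gives an estimate of the quantity in (iii), since the crossing at lag zero is excluded.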

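The reader should refer to Rice for the definition of all notation 
which is not defined in this brief. When $a \neq 0$, Rice's (38) becomes 

$$Q_2^+(\tau, I, a) = -[2\pi N_I]^{-1} \int_{-\pi}^{\pi} d\theta \int_{0}^{\infty} dI_1' \int_{-\infty}^{0} dI_2' \; I_1' I_2' \, p(I, I_1', I_2', I), \tag{2}$$

where $N_I$ = Rice's equation (2.6) or (2.7), 

$$p(I, I_1', I_2', I) = (2\pi)^{-2} M^{-1/2} \exp\left\{ -\frac{1}{2M} \left[ M_{22}(I_1'^2 + I_2'^2) + 2M_{23} I_1' I_2' + 2D_1 I_1' + 2E_1 I_2' + F_1 \right] \right\}$$

$$r_1 = \frac{M_{23}}{M_{22}}, \qquad Q = \sqrt{2a}$$

$$D_1 = M_{12}[I - Q\cos\theta] + M_{13}[Q\cos(q\tau + \theta) - I] + M_{22} Qq \sin\theta + M_{23} Qq \sin(q\tau + \theta)$$

$$E_1 = M_{12}[Q\cos(q\tau + \theta) - I] + M_{13}[I - Q\cos\theta] + M_{22} Qq \sin(q\tau + \theta) + M_{23} Qq \sin\theta$$

$$\begin{aligned}
F_1 = {} & M_{11}\{2I^2 - 2QI[\cos\theta + \cos(q\tau + \theta)] + Q^2[\cos^2\theta + \cos^2(q\tau + \theta)]\} \\
& + 2M_{12} Qq \{[I - Q\cos\theta]\sin\theta + [Q\cos(q\tau + \theta) - I]\sin(q\tau + \theta)\} \\
& + 2M_{13} Qq \{[I - Q\cos\theta]\sin(q\tau + \theta) + [Q\cos(q\tau + \theta) - I]\sin\theta\} \\
& + 2M_{14} \{I[I - Q\cos\theta] + Q[Q\cos\theta - I]\cos(q\tau + \theta)\} \\
& + M_{22}(Qq)^2[\sin^2\theta + \sin^2(q\tau + \theta)] + 2M_{23}(Qq)^2 \sin\theta \sin(q\tau + \theta).
\end{aligned}$$
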
The M's are given in Rice's Appendix I with 

$$m(\tau) = \int_{0}^{\infty} W(f) \cos 2\pi f \tau \, df, \tag{3}$$

where $W(f)$ = one-sided power spectral density of $I_N(t)$. When $I = 0$, 
$N_I Q_2^+(\tau, I, a)$ is equivalent to (9) of Cobb's recent work. 
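
As a numerical illustration of (3), the following Python sketch evaluates $m(\tau)$ by the trapezoidal rule for a rectangular low-pass spectrum of unit total power, for which the closed form $\sin(2\pi f_c \tau)/(2\pi f_c \tau)$ is known. The spectrum, the cutoff $f_c = 1$, and the function names are assumptions chosen for the example; they are not taken from this brief.

```python
import math

def autocorrelation(W, tau, f_max, n=20000):
    """Trapezoidal estimate of the integral of W(f)*cos(2*pi*f*tau) over [0, f_max]."""
    df = f_max / n
    total = 0.5 * (W(0.0) + W(f_max) * math.cos(2.0 * math.pi * f_max * tau))
    for k in range(1, n):
        f = k * df
        total += W(f) * math.cos(2.0 * math.pi * f * tau)
    return total * df

def lowpass(f, cutoff=1.0):
    """One-sided rectangular spectrum with unit total power (illustrative)."""
    return 1.0 / cutoff if f <= cutoff else 0.0

m0 = autocorrelation(lowpass, 0.0, f_max=1.0)       # unit variance, so m(0) is 1
m_quarter = autocorrelation(lowpass, 0.25, f_max=1.0)
exact_quarter = math.sin(0.5 * math.pi) / (0.5 * math.pi)
```

Because the noise has unit variance, any admissible $W(f)$ must integrate to one, which is why $m(0) = 1$ in this check.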
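Equation (2) can be put in a form analogous to Rice's equation (47): 

$$Q_2^+(\tau, I, a) = [4\pi^2 N_I]^{-1} M_{22} (1 - m^2)^{-3/2} \int_{-\pi}^{\pi} \exp(-G_1/2M) \, J(r_1, h_2, k_2) \, d\theta, \tag{4}$$

where 

$$J(r_1, h_2, k_2) = \frac{1}{2\pi\sqrt{1 - r_1^2}} \int_{h_2}^{\infty} dx \int_{k_2}^{\infty} dy \, (x - h_2)(y - k_2) \, e^{z}$$

$$z = -\frac{x^2 + y^2 - 2 r_1 x y}{2(1 - r_1^2)}$$

$$h_2 = \{M_{22}[1 - r_1^2]\}^{-1} [D_1 - r_1 E_1] \left[ \frac{1 - m^2}{M_{22}} \right]^{1/2}$$

$$k_2 = -\{M_{22}[1 - r_1^2]\}^{-1} [E_1 - r_1 D_1] \left[ \frac{1 - m^2}{M_{22}} \right]^{1/2}$$

$$G_1 = \{M_{22}[1 - r_1^2]\}^{-1} [2 r_1 D_1 E_1 - D_1^2 - E_1^2] + F_1.$$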
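
The step from (2) to (4) rests on completing the square in the exponent of the density. The following Python sketch checks that step numerically, under the reading that the exponent contains the quadratic form $M_{22}(u^2 + v^2) + 2M_{23}uv + 2D_1 u + 2E_1 v + F_1$ in the two slope variables $u$ and $v$. The shift values `u0`, `v0` and the numerical inputs are introduced here for illustration only.

```python
def complete_square(M22, M23, D1, E1, F1):
    """Return (r1, u0, v0, G1) such that the quadratic form equals
    M22*[(u-u0)^2 + (v-v0)^2 + 2*r1*(u-u0)*(v-v0)] + G1."""
    r1 = M23 / M22
    denom = M22 * (1.0 - r1 * r1)
    u0 = -(D1 - r1 * E1) / denom
    v0 = -(E1 - r1 * D1) / denom
    G1 = (2.0 * r1 * D1 * E1 - D1 * D1 - E1 * E1) / denom + F1
    return r1, u0, v0, G1

def quadratic(M22, M23, D1, E1, F1, u, v):
    """The quadratic form as it appears before completing the square."""
    return (M22 * (u * u + v * v) + 2.0 * M23 * u * v
            + 2.0 * D1 * u + 2.0 * E1 * v + F1)

def shifted(M22, r1, u0, v0, G1, u, v):
    """The same form written about the shifted origin (u0, v0)."""
    du, dv = u - u0, v - v0
    return M22 * (du * du + dv * dv + 2.0 * r1 * du * dv) + G1

# Arbitrary illustrative numbers, not values from the paper.
args = (2.0, 0.6, 0.3, -0.7, 1.1)
r1, u0, v0, G1 = complete_square(*args)
difference = quadratic(*args, 0.4, -1.3) - shifted(args[0], r1, u0, v0, G1, 0.4, -1.3)
```

The constant left over after the shift is the $G_1$ that appears in the exponential factor, and the shifts, once rescaled, give the lower limits $h_2$ and $k_2$.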


$Q_2^-(\tau, I, a)$ is obtained from (2) by changing the signs of the $\infty$'s in 
the limits of integration. We find that $Q_2^-(\tau, I, a)$ is equal to the 
right-hand side of (4) with $h_2$, $k_2$ replaced by $-h_2$, $-k_2$. 

$[U_2(\tau, I, a) - Q_2(\tau, I, a)]$ is obtained from (2) by changing the lower 
limit of integration of $I_2'$ to $+\infty$. We find that $[U_2(\tau, I, a) - Q_2(\tau, I, a)]$ 
is equal to the right-hand side of (4) with the function $J(r_1, h_2, k_2)$ 
replaced by the function $J_1(r_1, h_2, k_2)$, where 

$$J_1(r_1, h_2, k_2) = -\frac{1}{2\pi\sqrt{1 - r_1^2}} \int_{h_2}^{\infty} dx \int_{-\infty}^{k_2} dy \, (x - h_2)(y - k_2) \, e^{z}. \tag{5}$$


The functions $J(r_1, h_2, k_2)$ and $J_1(r_1, h_2, k_2)$ are expressed in terms 
of Karl Pearson's well-known tabulated function (d/N) in Ref. 5. 
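
Where tables are not at hand, $J$ can also be evaluated by direct numerical integration. The following Python sketch assumes that $J(r, h, k)$ is the integral of $(x - h)(y - k)$ against the standard bivariate normal density with correlation $r$ over $x > h$, $y > k$. It uses a midpoint rule on a truncated square; the truncation point and grid size are illustrative choices. For $h = k = 0$ the result is compared with the known closed form for the mean of the product of the positive parts of two correlated standard normal variables.

```python
import math

def J(r, h, k, upper=10.0, n=400):
    """Midpoint-rule estimate of J(r, h, k) on [h, upper] x [k, upper]."""
    dx = (upper - h) / n
    dy = (upper - k) / n
    c = 1.0 / (2.0 * (1.0 - r * r))
    total = 0.0
    for i in range(n):
        x = h + (i + 0.5) * dx
        for j in range(n):
            y = k + (j + 0.5) * dy
            total += (x - h) * (y - k) * math.exp(-c * (x * x + y * y - 2.0 * r * x * y))
    return total * dx * dy / (2.0 * math.pi * math.sqrt(1.0 - r * r))

def J_zero_thresholds(r):
    """Closed form of J(r, 0, 0)."""
    return (math.sqrt(1.0 - r * r) + r * (0.5 * math.pi + math.asin(r))) / (2.0 * math.pi)

j_uncorrelated = J(0.0, 0.0, 0.0)   # close to 1/(2*pi)
j_half = J(0.5, 0.0, 0.0)           # compare with J_zero_thresholds(0.5)
```

The sketch is adequate for moderate values of $|h|$ and $|k|$; for large thresholds the truncation point would need to be raised.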